Foreword


Beginning in the spring of 2000, a series of four one-semester courses
was taught at Princeton University whose purpose was to present, in
an integrated manner, the core areas of analysis. The objective was to 
make plain the organic unity that exists between the various parts of the 
subject, and to illustrate the wide applicability of ideas of analysis to 
other fields of mathematics and science. The present series of books is 
an elaboration of the lectures that were given. 

While there are a number of excellent texts dealing with individual 
parts of what we cover, our exposition aims at a different goal: presenting
the various sub-areas of analysis not as separate disciplines, but
rather as highly interconnected. It is our view that seeing these relations 
and their resulting synergies will motivate the reader to attain a better 
understanding of the subject as a whole. With this outcome in mind, we 
have concentrated on the main ideas and theorems that have shaped the 
field (sometimes sacrificing a more systematic approach), and we have 
been sensitive to the historical order in which the logic of the subject 
developed. 

We have organized our exposition into four volumes, each reflecting 
the material covered in a semester. Their contents may be broadly
summarized as follows:

I. Fourier series and integrals. 

II. Complex analysis. 

III. Measure theory, Lebesgue integration, and Hilbert spaces. 

IV. A selection of further topics, including functional analysis,
distributions, and elements of probability theory.

However, this listing does not by itself give a complete picture of 
the many interconnections that are presented, nor of the applications 
to other branches that are highlighted. To give a few examples: the
elements of (finite) Fourier series studied in Book I, which lead to Dirichlet
characters, and from there to the infinitude of primes in an arithmetic
progression; the X-ray and Radon transforms, which arise in a number of
problems in Book I, and reappear in Book III to play an important role in
understanding Besicovitch-like sets in two and three dimensions; Fatou’s
theorem, which guarantees the existence of boundary values of bounded
holomorphic functions in the disc, and whose proof relies on ideas
developed in each of the first three books; and the theta function, which first
occurs in Book I in the solution of the heat equation, and is then used
in Book II to find the number of ways an integer can be represented as
the sum of two or four squares, and in the analytic continuation of the
zeta function.

A few further words about the books and the courses on which they
were based. These courses were given at a rather intensive pace, with 48
lecture-hours a semester. The weekly problem sets played an indispensable
part, and as a result exercises and problems have a similarly
important role in our books. Each chapter has a series of “Exercises” that
are tied directly to the text, and while some are easy, others may require
more effort. However, the substantial number of hints that are given
should enable the reader to attack most exercises. There are also more
involved and challenging “Problems”; the ones that are most difficult, or
go beyond the scope of the text, are marked with an asterisk.

Despite the substantial connections that exist between the different 
volumes, enough overlapping material has been provided so that each of 
the first three books requires only minimal prerequisites: acquaintance 
with elementary topics in analysis such as limits, series, differentiable 
functions, and Riemann integration, together with some exposure to
linear algebra. This makes these books accessible to students interested
in such diverse disciplines as mathematics, physics, engineering, and 
finance, at both the undergraduate and graduate level. 

It is with great pleasure that we express our appreciation to all who 
have aided in this enterprise. We are particularly grateful to the
students who participated in the four courses. Their continuing interest,
enthusiasm, and dedication provided the encouragement that made this 
project possible. We also wish to thank Adrian Banner and Jose Luis 
Rodrigo for their special help in running the courses, and their efforts to 
see that the students got the most from each class. In addition, Adrian 
Banner also made valuable suggestions that are incorporated in the text. 
Preface to Book I


Any effort to present an overall view of analysis must at its start deal 
with the following questions: Where does one begin? What are the initial 
subjects to be treated, and in what order are the relevant concepts and 
basic techniques to be developed? 

Our answers to these questions are guided by our view of the centrality 
of Fourier analysis, both in the role it has played in the development of 
the subject, and in the fact that its ideas permeate much of the
present-day analysis. For these reasons we have devoted this first volume to an
exposition of some basic facts about Fourier series, taken together with 
a study of elements of Fourier transforms and finite Fourier analysis. 
Starting this way allows one to see rather easily certain applications to 
other sciences, together with the link to such topics as partial differential 
equations and number theory. In later volumes several of these
connections will be taken up from a more systematic point of view, and the ties
that exist with complex analysis, real analysis, Hilbert space theory, and 
other areas will be explored further. 

In the same spirit, we have been mindful not to overburden the
beginning student with some of the difficulties that are inherent in the subject:
a proper appreciation of the subtleties and technical complications that
arise can come only after one has mastered some of the initial ideas
involved. This point of view has led us to the following choice of material
in the present volume: 

• Fourier series. At this early stage it is not appropriate to
introduce measure theory and Lebesgue integration. For this reason
our treatment of Fourier series in the first four chapters is carried
out in the context of Riemann integrable functions. Even with this
restriction, a substantial part of the theory can be developed,
detailing convergence and summability; also, a variety of connections
with other problems in mathematics can be illustrated.

• Fourier transform. For the same reasons, instead of undertaking
the theory in a general setting, we confine ourselves in Chapters 5
and 6 largely to the framework of test functions. Despite these
limitations, we can learn a number of basic and interesting facts about
Fourier analysis in $\mathbb{R}^d$ and its relation to other areas, including the
wave equation and the Radon transform.

• Finite Fourier analysis. This is an introductory subject par
excellence, because limits and integrals are not explicitly present.
Nevertheless, the subject has several striking applications, including
the proof of the infinitude of primes in arithmetic progression.

Taking into account the introductory nature of this first volume, we 
have kept the prerequisites to a minimum. Although we suppose some 
acquaintance with the notion of the Riemann integral, we provide an 
appendix that contains most of the results about integration needed in 
the text. 

We hope that this approach will facilitate the goal that we have set 
for ourselves: to inspire the interested reader to learn more about this 
fascinating subject, and to discover how Fourier analysis affects decisively 
other parts of mathematics and science. 
The Genesis of Fourier Analysis


Regarding the researches of d’Alembert and Euler could 
one not add that if they knew this expansion, they 
made but a very imperfect use of it. They were both 
persuaded that an arbitrary and discontinuous function
could never be resolved in series of this kind, and
it does not even seem that anyone had developed a 
constant in cosines of multiple arcs, the first problem 
which I had to solve in the theory of heat. 

J. Fourier, 1808-9 


In the beginning, it was the problem of the vibrating string, and the 
later investigation of heat flow, that led to the development of Fourier 
analysis. The laws governing these distinct physical phenomena were 
expressed by two different partial differential equations, the wave and 
heat equations, and these were solved in terms of Fourier series. 

Here we want to start by describing in some detail the development 
of these ideas. We will do this initially in the context of the problem of 
the vibrating string, and we will proceed in three steps. First, we
describe several physical (empirical) concepts which motivate corresponding
mathematical ideas of importance for our study. These are: the role
of the functions $\cos t$, $\sin t$, and $e^{it}$ suggested by simple harmonic
motion; the use of separation of variables, derived from the phenomenon
of standing waves; and the related concept of linearity, connected to the 
superposition of tones. Next, we derive the partial differential equation 
which governs the motion of the vibrating string. Finally, we will use 
what we learned about the physical nature of the problem (expressed 
mathematically) to solve the equation. In the last section, we use the 
same approach to study the problem of heat diffusion. 

Given the introductory nature of this chapter and the subject matter 
covered, our presentation cannot be based on purely mathematical
reasoning. Rather, it proceeds by plausibility arguments and aims to provide
the motivation for the further rigorous analysis in the succeeding
chapters. The impatient reader who wishes to begin immediately with the
theorems of the subject may prefer to pass directly to the next chapter. 

1 The vibrating string 

The problem consists of the study of the motion of a string fixed at 
its end points and allowed to vibrate freely. We have in mind physical 
systems such as the strings of a musical instrument. As we mentioned 
above, we begin with a brief description of several observable physical 
phenomena on which our study is based. These are: 

• simple harmonic motion, 

• standing and traveling waves, 

• harmonics and superposition of tones. 

Understanding the empirical facts behind these phenomena will
motivate our mathematical approach to vibrating strings.

Simple harmonic motion 

Simple harmonic motion describes the behavior of the most basic
oscillatory system (called the simple harmonic oscillator), and is therefore
a natural place to start the study of vibrations. Consider a mass {m} 
attached to a horizontal spring, which itself is attached to a fixed wall, 
and assume that the system lies on a frictionless surface. 

Choose an axis whose origin coincides with the center of the mass when
it is at rest (that is, the spring is neither stretched nor compressed), as
shown in Figure 1. When the mass is displaced from its initial equilibrium
position and then released, it will undergo simple harmonic motion.
This motion can be described mathematically once we have found the
differential equation that governs the movement of the mass.

Figure 1. Simple harmonic oscillator

Let $y(t)$ denote the displacement of the mass at time $t$. We assume that
the spring is ideal, in the sense that it satisfies Hooke’s law: the restoring
force $F$ exerted by the spring on the mass is given by $F = -ky(t)$. Here
$k > 0$ is a given physical quantity called the spring constant. Applying
Newton’s law (force = mass × acceleration), we obtain

$$ -ky(t) = my''(t), $$

where we use the notation $y''$ to denote the second derivative of $y$ with
respect to $t$. With $c = \sqrt{k/m}$, this second order ordinary differential
equation becomes

$$ y''(t) + c^2 y(t) = 0. \tag{1} $$

The general solution of equation (1) is given by 

$$ y(t) = a \cos ct + b \sin ct, $$

where a and b are constants. Clearly, all functions of this form solve 
equation (1), and Exercise 6 outlines a proof that these are the only 
(twice differentiable) solutions of that differential equation. 

In the above expression for $y(t)$, the quantity $c$ is given, but $a$ and $b$
can be any real numbers. In order to determine the particular solution
of the equation, we must impose two initial conditions in view of the
two unknown constants $a$ and $b$. For example, if we are given $y(0)$ and
$y'(0)$, the initial position and velocity of the mass, then the solution of
the physical problem is unique and given by

$$ y(t) = y(0) \cos ct + \frac{y'(0)}{c} \sin ct. $$

One can easily verify that there exist constants $A > 0$ and $\varphi \in \mathbb{R}$ such
that

$$ a \cos ct + b \sin ct = A \cos(ct - \varphi). $$

Because of the physical interpretation given above, one calls $A = \sqrt{a^2 + b^2}$
the “amplitude” of the motion, $c$ its “natural frequency,” $\varphi$ its “phase”
(uniquely determined up to an integer multiple of $2\pi$), and $2\pi/c$ the
“period” of the motion.
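The relations above can be checked numerically. The following is a minimal sketch, not part of the original exposition: the values of the spring constant, mass, and initial data are hypothetical, and the helper names `sho_solution` and `amplitude_phase` are introduced only for illustration. It tests equation (1) with a centered second difference and compares the two forms of the solution.

```python
import math

def sho_solution(y0, v0, c):
    """Return y(t) = y0*cos(ct) + (v0/c)*sin(ct), which solves y'' + c^2 y = 0
    with initial position y0 and initial velocity v0."""
    def y(t):
        return y0 * math.cos(c * t) + (v0 / c) * math.sin(c * t)
    return y

def amplitude_phase(a, b):
    """Rewrite a*cos(ct) + b*sin(ct) as A*cos(ct - phi)."""
    A = math.hypot(a, b)      # A = sqrt(a^2 + b^2)
    phi = math.atan2(b, a)    # chosen so that a = A*cos(phi), b = A*sin(phi)
    return A, phi

k, m = 4.0, 1.0               # hypothetical spring constant and mass
c = math.sqrt(k / m)          # natural frequency c = sqrt(k/m)
y0, v0 = 1.0, 2.0             # hypothetical initial position and velocity
y = sho_solution(y0, v0, c)
A, phi = amplitude_phase(y0, v0 / c)

h = 1e-4                      # step for the centered second difference
for t in (0.0, 0.3, 1.7):
    second = (y(t + h) + y(t - h) - 2 * y(t)) / h**2   # approximates y''(t)
    residual = second + c**2 * y(t)                    # should be close to 0
    gap = y(t) - A * math.cos(c * t - phi)             # two forms should agree
    print(f"t={t}: ODE residual={residual:.2e}, form gap={gap:.2e}")
```

The phase is computed with `atan2` so that the quadrant of $(a, b)$ is respected; any other choice differing by a multiple of $2\pi$ would do equally well.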

The typical graph of the function $A \cos(ct - \varphi)$, illustrated in
Figure 2, exhibits a wavelike pattern that is obtained from translating
and stretching (or shrinking) the usual graph of $\cos t$.

Figure 2. The graph of $A \cos(ct - \varphi)$

We make two observations regarding our examination of simple harmonic
motion. The first is that the mathematical description of the most
elementary oscillatory system, namely simple harmonic motion, involves
the most basic trigonometric functions $\cos t$ and $\sin t$. It will be
important in what follows to recall the connection between these functions
and complex numbers, as given in Euler’s identity $e^{it} = \cos t + i \sin t$.
The second observation is that simple harmonic motion is determined as
a function of time by two initial conditions, one determining the position,
and the other the velocity (specified, for example, at time $t = 0$). This
property is shared by more general oscillatory systems, as we shall see
below.

Standing and traveling waves 

As it turns out, the vibrating string can be viewed in terms of
one-dimensional wave motions. Here we want to describe two kinds of
motions that lend themselves to simple graphic representations.

• First, we consider standing waves. These are wavelike motions
described by the graphs $y = u(x, t)$ developing in time $t$ as shown
in Figure 3.

In other words, there is an initial profile $y = \varphi(x)$ representing the
wave at time $t = 0$, and an amplifying factor $\psi(t)$, depending on $t$,
so that $y = u(x, t)$ with

$$ u(x, t) = \varphi(x)\psi(t). $$

The nature of standing waves suggests the mathematical idea of
“separation of variables,” to which we will return later.

Figure 3. A standing wave at different moments in time: $t = 0$ and $t = t_0$

• A second type of wave motion that is often observed in nature is
that of a traveling wave. Its description is particularly simple:
there is an initial profile $F(x)$ so that $u(x, t)$ equals $F(x)$ when
$t = 0$. As $t$ evolves, this profile is displaced to the right by $ct$ units,
where $c$ is a positive constant, namely

$$ u(x, t) = F(x - ct). $$

Graphically, the situation is depicted in Figure 4.

Figure 4. A traveling wave at two different moments in time: $t = 0$ and $t = t_0$

Since the movement in $t$ is at the rate $c$, that constant represents the
velocity of the wave. The function $F(x - ct)$ is a one-dimensional
traveling wave moving to the right. Similarly, $u(x, t) = F(x + ct)$
is a one-dimensional traveling wave moving to the left.
Harmonics and superposition of tones

The final physical observation we want to mention (without going into 
any details now) is one that musicians have been aware of since time 
immemorial. It is the existence of harmonics, or overtones. The pure 
tones are accompanied by combinations of overtones which are
primarily responsible for the timbre (or tone color) of the instrument. The idea
of combination or superposition of tones is implemented mathematically 
by the basic concept of linearity, as we shall see below. 


We now turn our attention to our main problem, that of describing the 
motion of a vibrating string. First, we derive the wave equation, that is, 
the partial differential equation that governs the motion of the string. 


1.1 Derivation of the wave equation 

Imagine a homogeneous string placed in the $(x, y)$-plane, and stretched
along the $x$-axis between $x = 0$ and $x = L$. If it is set to vibrate, its
displacement $y = u(x, t)$ is then a function of $x$ and $t$, and the goal is to
derive the differential equation which governs this function.

For this purpose, we consider the string as being subdivided into a
large number $N$ of masses (which we think of as individual particles)
distributed uniformly along the $x$-axis, so that the $n$th particle has its
$x$-coordinate at $x_n = nL/N$. We shall therefore conceive of the vibrating
string as a complex system of $N$ particles, each oscillating in the
vertical direction only; however, unlike the simple harmonic oscillator we
considered previously, each particle will have its oscillation linked to its
immediate neighbor by the tension of the string.






Figure 5. A vibrating string as a discrete system of masses 
We then set $y_n(t) = u(x_n, t)$, and note that $x_{n+1} - x_n = h$, with $h = L/N$.
If we assume that the string has constant density $\rho > 0$, it is
reasonable to assign mass equal to $\rho h$ to each particle. By Newton’s law,
$\rho h \, y_n''(t)$ equals the force acting on the $n$th particle. We now make the
simple assumption that this force is due to the effect of the two nearby
particles, the ones with $x$-coordinates at $x_{n-1}$ and $x_{n+1}$ (see Figure 5).
We further assume that the force (or tension) coming from the right of
the $n$th particle is proportional to $(y_{n+1} - y_n)/h$, where $h$ is the distance
between $x_{n+1}$ and $x_n$; hence we can write the tension as

$$ \left( \frac{\tau}{h} \right) (y_{n+1} - y_n), $$

where $\tau > 0$ is a constant equal to the coefficient of tension of the string.
There is a similar force coming from the left, and it is

$$ \left( \frac{\tau}{h} \right) (y_{n-1} - y_n). $$

Altogether, adding these forces gives us the desired relation between the
oscillators $y_n(t)$, namely

$$ \rho h \, y_n''(t) = \frac{\tau}{h} \{ y_{n+1}(t) + y_{n-1}(t) - 2y_n(t) \}. \tag{2} $$

On the one hand, with the notation chosen above, we see that 

$$ y_{n+1}(t) + y_{n-1}(t) - 2y_n(t) = u(x_n + h, t) + u(x_n - h, t) - 2u(x_n, t). $$

On the other hand, for any reasonable function $F(x)$ (that is, one that
has continuous second derivatives) we have

$$ \frac{F(x + h) + F(x - h) - 2F(x)}{h^2} \to F''(x) \quad \text{as } h \to 0. $$
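The limit above can be observed numerically. The following is a minimal sketch under stated assumptions: the test function $F = \sin$, the point $x = 0.7$, and the helper name `second_difference` are illustrative choices that do not come from the text.

```python
import math

def second_difference(F, x, h):
    """Centered second difference (F(x+h) + F(x-h) - 2F(x)) / h^2."""
    return (F(x + h) + F(x - h) - 2.0 * F(x)) / h**2

F = math.sin                  # illustrative choice, for which F''(x) = -sin(x)
x = 0.7                       # hypothetical evaluation point
exact = -math.sin(x)

errors = []
for h in (1e-1, 1e-2, 1e-3):
    err = abs(second_difference(F, x, h) - exact)
    errors.append(err)
    print(f"h={h:g}: |second difference - F''(x)| = {err:.3e}")
```

For a smooth function the error is expected to shrink roughly like $h^2$, which is consistent with the passage from the discrete system of particles to the continuous string.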

Thus we may conclude, after dividing by $h$ in (2) and letting $h$ tend to
zero (that is, $N$ goes to infinity), that

$$ \rho \frac{\partial^2 u}{\partial t^2} = \tau \frac{\partial^2 u}{\partial x^2}, $$

or

$$ \frac{1}{c^2} \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \quad \text{with } c = \sqrt{\tau/\rho}. $$

This relation is known as the one-dimensional wave equation, or
more simply as the wave equation. For reasons that will be apparent
later, the coefficient $c > 0$ is called the velocity of the motion.
In connection with this partial differential equation, we make an
important simplifying mathematical remark. This has to do with scaling,
or in the language of physics, a “change of units.” That is, we can think of
the coordinate $x$ as $x = aX$ where $a$ is an appropriate positive constant.
Now, in terms of the new coordinate $X$, the interval $0 \le x \le L$ becomes
$0 \le X \le L/a$. Similarly, we can replace the time coordinate $t$ by $t = bT$,
where $b$ is another positive constant. If we set $U(X, T) = u(x, t)$, then

$$ \frac{\partial U}{\partial X} = a \frac{\partial u}{\partial x}, \qquad \frac{\partial^2 U}{\partial X^2} = a^2 \frac{\partial^2 u}{\partial x^2}, $$

and similarly for the derivatives in $t$. So if we choose $a$ and $b$
appropriately, we can transform the one-dimensional wave equation into

$$ \frac{\partial^2 U}{\partial T^2} = \frac{\partial^2 U}{\partial X^2}, $$

which has the effect of setting the velocity $c$ equal to 1. Moreover, we have
the freedom to transform the interval $0 \le x \le L$ to $0 \le X \le \pi$. (We shall
see that the choice of $\pi$ is convenient in many circumstances.) All this
is accomplished by taking $a = L/\pi$ and $b = L/(c\pi)$. Once we solve the
new equation, we can of course return to the original equation by making
the inverse change of variables. Hence, we do not sacrifice generality by
thinking of the wave equation as given on the interval $[0, \pi]$ with velocity
$c = 1$.


1.2 Solution to the wave equation 

Having derived the equation for the vibrating string, we now explain two 
methods to solve it: 

• using traveling waves, 

• using the superposition of standing waves. 

While the first approach is very simple and elegant, it does not directly 
give full insight into the problem; the second method accomplishes that, 
and moreover is of wide applicability. It was first believed that the second 
method applied only in the simple cases where the initial position and 
velocity of the string were themselves given as a superposition of standing 
waves. However, as a consequence of Fourier’s ideas, it became clear that 
the problem could be worked either way for all initial conditions. 
Traveling waves

To simplify matters as before, we assume that $c = 1$ and $L = \pi$, so that
the equation we wish to solve becomes

$$ \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} \quad \text{on } 0 \le x \le \pi. $$


The crucial observation is the following: if $F$ is any twice differentiable
function, then $u(x, t) = F(x + t)$ and $u(x, t) = F(x - t)$ solve the wave
equation. The verification of this is a simple exercise in differentiation.
Note that the graph of $u(x, t) = F(x - t)$ at time $t = 0$ is simply the
graph of $F$, and that at time $t = 1$ it becomes the graph of $F$ translated
to the right by 1. Therefore, we recognize that $F(x - t)$ is a traveling
wave which travels to the right with speed 1. Similarly, $u(x, t) = F(x + t)$
is a wave traveling to the left with speed 1. These motions are depicted
in Figure 6.



Our discussion of tones and their combinations leads us to observe
that the wave equation is linear. This means that if $u(x, t)$ and $v(x, t)$
are particular solutions, then so is $\alpha u(x, t) + \beta v(x, t)$, where $\alpha$ and $\beta$
are any constants. Therefore, we may superpose two waves traveling in
opposite directions to find that whenever $F$ and $G$ are twice differentiable
functions, then

$$ u(x, t) = F(x + t) + G(x - t) $$

is a solution of the wave equation. In fact, we now show that all solutions
take this form.
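As an informal check of the superposition just described, one can estimate $u_{tt} - u_{xx}$ by centered differences. The sketch below is only illustrative: the profiles `F` and `G`, the sample points, and the helper names `superpose` and `wave_residual` are assumptions made for the example, not taken from the text.

```python
import math

def superpose(F, G):
    """Return u(x, t) = F(x + t) + G(x - t)."""
    return lambda x, t: F(x + t) + G(x - t)

def wave_residual(u, x, t, h=1e-3):
    """Centered-difference estimate of u_tt - u_xx at the point (x, t)."""
    u_tt = (u(x, t + h) + u(x, t - h) - 2.0 * u(x, t)) / h**2
    u_xx = (u(x + h, t) + u(x - h, t) - 2.0 * u(x, t)) / h**2
    return u_tt - u_xx

F = lambda s: math.exp(-s * s)    # hypothetical profile moving to the left
G = lambda s: math.sin(3.0 * s)   # hypothetical profile moving to the right
u = superpose(F, G)

residuals = [wave_residual(u, x, t)
             for x in (-1.0, 0.2, 2.5)
             for t in (0.0, 0.7, 1.9)]
print("largest |u_tt - u_xx| estimate:", max(abs(r) for r in residuals))
```

A small residual at a handful of points is of course no proof; the verification by differentiation mentioned in the text remains the actual argument.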

We drop for the moment the assumption that $0 \le x \le \pi$, and suppose
that $u$ is a twice differentiable function which solves the wave equation
for all real $x$ and $t$. Consider the following new set of variables $\xi = x + t$,
$\eta = x - t$, and define $v(\xi, \eta) = u(x, t)$. The change of variables formula
shows that $v$ satisfies

$$ \frac{\partial^2 v}{\partial \xi \, \partial \eta} = 0. $$

Integrating this relation twice gives $v(\xi, \eta) = F(\xi) + G(\eta)$, which then
implies

$$ u(x, t) = F(x + t) + G(x - t), $$

for some functions $F$ and $G$.

We must now connect this result with our original problem, that is,
the physical motion of a string. There, we imposed the restrictions
$0 \le x \le \pi$, the initial shape of the string $u(x, 0) = f(x)$, and also the fact
that the string has fixed end points, namely $u(0, t) = u(\pi, t) = 0$ for all
$t$. To use the simple observation above, we first extend $f$ to all of $\mathbb{R}$ by
making it odd¹ on $[-\pi, \pi]$, and then periodic² in $x$ of period $2\pi$, and
similarly for $u(x, t)$, the solution of our problem. Then the extension $u$
solves the wave equation on all of $\mathbb{R}$, and $u(x, 0) = f(x)$ for all $x \in \mathbb{R}$.
Therefore, $u(x, t) = F(x + t) + G(x - t)$, and setting $t = 0$ we find that

$$ F(x) + G(x) = f(x). $$

Since many choices of $F$ and $G$ will satisfy this identity, this suggests
imposing another initial condition on $u$ (similar to the two initial
conditions in the case of simple harmonic motion), namely the initial velocity
of the string which we denote by $g(x)$:

$$ \frac{\partial u}{\partial t}(x, 0) = g(x), $$

where of course $g(0) = g(\pi) = 0$. Again, we extend $g$ to $\mathbb{R}$ first by
making it odd over $[-\pi, \pi]$, and then periodic of period $2\pi$. The two initial
conditions of position and velocity now translate into the following
system:

$$ \begin{cases} F(x) + G(x) = f(x), \\ F'(x) - G'(x) = g(x). \end{cases} $$


1 A function $f$ defined on a set $U$ is odd if $-x \in U$ whenever $x \in U$ and $f(-x) = -f(x)$,
and even if $f(-x) = f(x)$.

2 A function $f$ on $\mathbb{R}$ is periodic of period $\omega$ if $f(x + \omega) = f(x)$ for all $x$.
Differentiating the first equation and adding it to the second, we obtain

$$ 2F'(x) = f'(x) + g(x). $$

Similarly

$$ 2G'(x) = f'(x) - g(x), $$

and hence there are constants $C_1$ and $C_2$ so that

$$ F(x) = \frac{1}{2} \left[ f(x) + \int_0^x g(y) \, dy \right] + C_1 $$

and

$$ G(x) = \frac{1}{2} \left[ f(x) - \int_0^x g(y) \, dy \right] + C_2. $$

Since $F(x) + G(x) = f(x)$ we conclude that $C_1 + C_2 = 0$, and therefore,
our final solution of the wave equation with the given initial conditions
takes the form

$$ u(x, t) = \frac{1}{2} [f(x + t) + f(x - t)] + \frac{1}{2} \int_{x-t}^{x+t} g(y) \, dy. $$

The form of this solution is known as d’Alembert’s formula. Observe
that the extensions we chose for $f$ and $g$ guarantee that the string always
has fixed ends, that is, $u(0, t) = u(\pi, t) = 0$ for all $t$.
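To make d’Alembert’s formula concrete, here is a minimal numerical sketch. It is an illustration under stated assumptions rather than part of the original argument: the initial data $f(x) = \sin x$ and $g(x) = \sin 2x$ are hypothetical, the helper names are invented for the example, and the integral is approximated by a composite midpoint rule.

```python
import math

def odd_periodic(f):
    """Extend f from [0, pi] to R: odd on [-pi, pi], then 2*pi-periodic."""
    def ext(x):
        x = (x + math.pi) % (2 * math.pi) - math.pi   # reduce to [-pi, pi)
        return f(x) if x >= 0 else -f(-x)
    return ext

def integrate(g, a, b, n=2000):
    """Composite midpoint rule for the integral of g from a to b."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def dalembert(f, g):
    """u(x,t) = [f(x+t) + f(x-t)]/2 + (1/2) * integral of g over [x-t, x+t],
    using the odd, 2*pi-periodic extensions of f and g."""
    f_ext, g_ext = odd_periodic(f), odd_periodic(g)
    def u(x, t):
        average = 0.5 * (f_ext(x + t) + f_ext(x - t))
        return average + 0.5 * integrate(g_ext, x - t, x + t)
    return u

f = lambda x: math.sin(x)        # hypothetical initial position
g = lambda x: math.sin(2 * x)    # hypothetical initial velocity
u = dalembert(f, g)

for t in (0.0, 0.8, 2.0):
    print(f"t={t}: u(0,t)={u(0.0, t):.1e}, u(pi,t)={u(math.pi, t):.1e}, "
          f"u(1,t)={u(1.0, t):.6f}")
```

For this particular choice of data one can check by direct differentiation that $u(x, t) = \cos t \sin x + \tfrac{1}{2} \sin 2t \sin 2x$ solves the problem, which gives a value to compare the numerical output against.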

A final remark is in order. The passage from $t \ge 0$ to $t \in \mathbb{R}$, and then
back to $t \ge 0$, which was made above, exhibits the time reversal property
of the wave equation. In other words, a solution $u$ to the wave equation
for $t \ge 0$, leads to a solution $u^-$ defined for negative time $t < 0$ simply
by setting $u^-(x, t) = u(x, -t)$, a fact which follows from the invariance
of the wave equation under the transformation $t \mapsto -t$. The situation is
quite different in the case of the heat equation.

Superposition of standing waves 

We turn to the second method of solving the wave equation, which is
based on two fundamental conclusions from our previous physical
observations. By our considerations of standing waves, we are led to look for
special solutions to the wave equation which are of the form

$$ \varphi(x)\psi(t). $$

This procedure, which works equally well in other contexts (in the case
of the heat equation, for instance), is called separation of variables
and constructs solutions that are called pure tones. Then by the linearity
of the wave equation, we can expect to combine these pure tones into a
more complex combination of sound. Pushing this idea further, we can
hope ultimately to express the general solution of the wave equation in
terms of sums of these particular solutions.

Note that one side of the wave equation involves only differentiation
in $x$, while the other, only differentiation in $t$. This observation
provides another reason to look for solutions of the equation in the form
$u(x, t) = \varphi(x)\psi(t)$ (that is, to “separate variables”), the hope being to
reduce a difficult partial differential equation into a system of simpler
ordinary differential equations. In the case of the wave equation, with $u$
of the above form, we get

$$ \varphi(x)\psi''(t) = \varphi''(x)\psi(t), $$

and therefore

$$ \frac{\psi''(t)}{\psi(t)} = \frac{\varphi''(x)}{\varphi(x)}. $$

The key observation here is that the left-hand side depends only on $t$,
and the right-hand side only on $x$. This can happen only if both sides
are equal to a constant, say $\lambda$. Therefore, the wave equation reduces to
the following

$$ \begin{cases} \psi''(t) - \lambda \psi(t) = 0 \\ \varphi''(x) - \lambda \varphi(x) = 0. \end{cases} \tag{3} $$

We focus our attention on the first equation in the above system. At
this point, the reader will recognize the equation we obtained in the
study of simple harmonic motion. Note that we need to consider only
the case when $\lambda < 0$, since when $\lambda \ge 0$ the solution $\psi$ will not oscillate
as time varies. Therefore, we may write $\lambda = -m^2$, and the solution of
the equation is then given by

$$ \psi(t) = A \cos mt + B \sin mt. $$

Similarly, we find that the solution of the second equation in (3) is

$$ \varphi(x) = \tilde{A} \cos mx + \tilde{B} \sin mx. $$

Now we take into account that the string is attached at $x = 0$ and $x = \pi$.
This translates into $\varphi(0) = \varphi(\pi) = 0$, which in turn gives $\tilde{A} = 0$, and
if $\tilde{B} \ne 0$, then $m$ must be an integer. If $m = 0$, the solution vanishes
identically, and if $m \le -1$, we may rename the constants and reduce to
the case $m \ge 1$ since the function $\sin y$ is odd and $\cos y$ is even. Finally,
we arrive at the guess that for each $m \ge 1$, the function

$$ u_m(x, t) = (A_m \cos mt + B_m \sin mt) \sin mx, $$

which we recognize as a standing wave, is a solution to the wave
equation. Note that in the above argument we divided by $\varphi$ and $\psi$, which
sometimes vanish, so one must actually check by hand that the standing
wave $u_m$ solves the equation. This straightforward calculation is left as
an exercise to the reader.
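Before doing the calculation by hand, one may find it reassuring to look at the separation constant numerically. The sketch below is hedged accordingly: the mode number, the constants `A` and `B`, and the sample point are hypothetical, and second derivatives are only approximated by centered differences. It checks that both ratios $\psi''/\psi$ and $\varphi''/\varphi$ are close to $\lambda = -m^2$, and that the standing wave vanishes at the end points.

```python
import math

def second_derivative(f, s, h=1e-3):
    """Centered-difference approximation of f''(s)."""
    return (f(s + h) + f(s - h) - 2.0 * f(s)) / h**2

m, A, B = 3, 0.5, -1.2                     # hypothetical mode number and constants
psi = lambda t: A * math.cos(m * t) + B * math.sin(m * t)   # time factor
phi = lambda x: math.sin(m * x)                              # space factor
u_m = lambda x, t: psi(t) * phi(x)                           # standing wave

x0, t0 = 0.4, 1.1                          # a point where phi and psi do not vanish
ratio_t = second_derivative(psi, t0) / psi(t0)   # approximates psi''/psi
ratio_x = second_derivative(phi, x0) / phi(x0)   # approximates phi''/phi
print("psi''/psi ~", ratio_t, " phi''/phi ~", ratio_x, " -m^2 =", -m**2)
print("end points:", u_m(0.0, t0), u_m(math.pi, t0))
```

The sample point is chosen away from the zeros of $\varphi$ and $\psi$ precisely because the division used in the separation argument is not legitimate there.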

Before proceeding further with the analysis of the wave equation, we
pause to discuss standing waves in more detail. The terminology comes
from looking at the graph of $u_m(x, t)$ for each fixed $t$. Suppose first that
$m = 1$, and take $u(x, t) = \cos t \sin x$. Then, Figure 7 (a) gives the graph
of $u$ for different values of $t$.



Figure 7. Fundamental tone (a) and overtones (b) at different moments 
in time 


The case $m = 1$ corresponds to the fundamental tone or first
harmonic of the vibrating string.

We now take $m = 2$ and look at $u(x, t) = \cos 2t \sin 2x$. This
corresponds to the first overtone or second harmonic, and this motion is
described in Figure 7 (b). Note that $u(\pi/2, t) = 0$ for all $t$. Such points,
which remain motionless in time, are called nodes, while points whose
motion has maximum amplitude are named anti-nodes.

For higher values of $m$ we get more overtones or higher harmonics.
Note that as $m$ increases, the frequency increases, and the period $2\pi/m$
decreases. Therefore, the fundamental tone has a lower frequency than
the overtones.

We now return to the original problem. Recall that the wave equation
is linear in the sense that if $u$ and $v$ solve the equation, so does $\alpha u + \beta v$
for any constants $\alpha$ and $\beta$. This allows us to construct more solutions
by taking linear combinations of the standing waves $u_m$. This technique,
called superposition, leads to our final guess for a solution of the wave
equation

$$ u(x, t) = \sum_{m=1}^{\infty} (A_m \cos mt + B_m \sin mt) \sin mx. \tag{4} $$

Note that the above sum is infinite, so that questions of convergence
arise, but since most of our arguments so far are formal, we will not
worry about this point now.

Suppose the above expression gave all the solutions to the wave
equation. If we then require that the initial position of the string at time
$t = 0$ is given by the shape of the graph of the function $f$ on $[0, \pi]$, with
of course $f(0) = f(\pi) = 0$, we would have $u(x, 0) = f(x)$, hence

$$ \sum_{m=1}^{\infty} A_m \sin mx = f(x). $$

Since the initial shape of the string can be any reasonable function $f$, we
must ask the following basic question:

Given a function $f$ on $[0, \pi]$ (with $f(0) = f(\pi) = 0$), can we
find coefficients $A_m$ so that

$$ f(x) = \sum_{m=1}^{\infty} A_m \sin mx \, ? \tag{5} $$

This question is stated loosely, but a lot of our effort in the next two
chapters of this book will be to formulate the question precisely and
attempt to answer it. This was the basic problem that initiated the
study of Fourier analysis.

A simple observation allows us to guess a formula giving $A_n$ if the
expansion (5) were to hold. Indeed, we multiply both sides by $\sin nx$
and integrate over $[0,\pi]$; working formally, we obtain

$$\int_0^{\pi} f(x) \sin nx \, dx = \int_0^{\pi} \left( \sum_{m=1}^{\infty} A_m \sin mx \right) \sin nx \, dx = \sum_{m=1}^{\infty} A_m \int_0^{\pi} \sin mx \sin nx \, dx = A_n \cdot \frac{\pi}{2},$$

where we have used the fact that

$$\int_0^{\pi} \sin mx \sin nx \, dx = \begin{cases} 0 & \text{if } m \neq n, \\ \pi/2 & \text{if } m = n. \end{cases}$$


Therefore, the guess for $A_n$, called the $n$th Fourier sine coefficient of $f$,
is

$$A_n = \frac{2}{\pi} \int_0^{\pi} f(x) \sin nx \, dx. \tag{6}$$


We shall return to this formula, and other similar ones, later. 
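
Formula (6) can be tried out numerically. The following is a minimal sketch, not part of the original text: the helper `sine_coefficient`, the midpoint rule, and the test function `f` (a finite sine sum whose coefficients are known in advance) are all illustrative assumptions.

```python
import math

def sine_coefficient(f, n, steps=2000):
    """Approximate A_n = (2/pi) * integral_0^pi f(x) sin(nx) dx by the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h          # midpoint of the k-th subinterval
        total += f(x) * math.sin(n * x)
    return 2.0 / math.pi * total * h

def f(x):
    # Hypothetical test function: 3 sin x - 0.5 sin 4x, so A_1 = 3 and A_4 = -0.5.
    return 3.0 * math.sin(x) - 0.5 * math.sin(4 * x)

coeffs = [sine_coefficient(f, n) for n in range(1, 6)]
for n, a in enumerate(coeffs, start=1):
    print(n, a)
```

Up to rounding, the computed values recover $A_1 = 3$ and $A_4 = -0.5$, with the other coefficients vanishing, which is the orthogonality relation at work.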


One can transform the question about Fourier sine series on $[0,\pi]$ to
a more general question on the interval $[-\pi,\pi]$. If we could express $f$
on $[0,\pi]$ in terms of a sine series, then this expansion would also hold on
$[-\pi,\pi]$ if we extend $f$ to this interval by making it odd. Similarly, one
can ask if an even function $g(x)$ on $[-\pi,\pi]$ can be expressed as a cosine
series, namely

$$g(x) = \sum_{m=0}^{\infty} A'_m \cos mx.$$


More generally, since an arbitrary function $F$ on $[-\pi,\pi]$ can be expressed
as $f + g$, where $f$ is odd and $g$ is even,³ we may ask if $F$ can be written
as

$$F(x) = \sum_{m=1}^{\infty} A_m \sin mx + \sum_{m=0}^{\infty} A'_m \cos mx,$$

or by applying Euler’s identity $e^{ix} = \cos x + i \sin x$, we could hope that
$F$ takes the form

$$F(x) = \sum_{m=-\infty}^{\infty} a_m e^{imx}.$$


3 Take, for example, $f(x) = [F(x) - F(-x)]/2$ and $g(x) = [F(x) + F(-x)]/2$.

By analogy with (6), we can use the fact that

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} e^{imx} e^{-inx} \, dx = \begin{cases} 0 & \text{if } n \neq m, \\ 1 & \text{if } n = m, \end{cases}$$

to see that one expects that

$$a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(x) e^{-inx} \, dx.$$

The quantity $a_n$ is called the $n$th Fourier coefficient of $F$.

We can now reformulate the problem raised above: 

Question: Given any reasonable function $F$ on $[-\pi,\pi]$, with
Fourier coefficients defined above, is it true that

$$F(x) = \sum_{n=-\infty}^{\infty} a_n e^{inx}\,? \tag{7}$$


This formulation of the problem, in terms of complex exponentials, is 
the form we shall use the most in what follows. 

Joseph Fourier (1768-1830) was the first to believe that an “arbitrary”
function $F$ could be given as a series (7). In other words, his idea was
that any function is the linear combination (possibly infinite) of the most
basic trigonometric functions $\sin mx$ and $\cos mx$, where $m$ ranges over
the integers.⁴ Although this idea was implicit in earlier work, Fourier had
the conviction that his predecessors lacked, and he used it in his study 
of heat diffusion; this began the subject of “Fourier analysis.” This 
discipline, which was first developed to solve certain physical problems, 
has proved to have many applications in mathematics and other fields as 
well, as we shall see later. 

We return to the wave equation. To formulate the problem correctly, 
we must impose two initial conditions, as our experience with simple 
harmonic motion and traveling waves indicated. The conditions assign 
the initial position and velocity of the string. That is, we require that u 
satisfy the differential equation and the two conditions 


$$u(x,0) = f(x) \quad \text{and} \quad \frac{\partial u}{\partial t}(x,0) = g(x),$$


4 The first proof that a general class of functions can be represented by Fourier series 
was given later by Dirichlet; see Problem 6, Chapter 4.

where $f$ and $g$ are pre-assigned functions. Note that this is consistent
with (4) in that this requires that $f$ and $g$ be expressible as

$$f(x) = \sum_{m=1}^{\infty} A_m \sin mx \quad \text{and} \quad g(x) = \sum_{m=1}^{\infty} m B_m \sin mx.$$
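
The factor $m$ in the expansion of $g$ comes from differentiating (4) in $t$. A formal sketch, assuming the series may be differentiated term by term (which has not been justified at this stage):

```latex
\begin{aligned}
\frac{\partial u}{\partial t}(x,t) &= \sum_{m=1}^{\infty} \bigl(-m A_m \sin mt + m B_m \cos mt\bigr) \sin mx, \\
\frac{\partial u}{\partial t}(x,0) &= \sum_{m=1}^{\infty} m B_m \sin mx = g(x), \\
u(x,0) &= \sum_{m=1}^{\infty} A_m \sin mx = f(x).
\end{aligned}
```

So the coefficients $A_m$ are fixed by the initial position and the coefficients $B_m$ by the initial velocity.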


1.3 Example: the plucked string 

We now apply our reasoning to the particular problem of the plucked
string. For simplicity we choose units so that the string is taken on the
interval $[0,\pi]$, and it satisfies the wave equation with $c = 1$. The string is
assumed to be plucked to height $h$ at the point $p$ with $0 < p < \pi$; this is
the initial position. That is, we take as our initial position the triangular
shape given by

$$f(x) = \begin{cases} \dfrac{xh}{p} & \text{for } 0 \le x \le p, \\[2mm] \dfrac{h(\pi - x)}{\pi - p} & \text{for } p \le x \le \pi, \end{cases}$$

which is depicted in Figure 8.



Figure 8. Initial position of a plucked string 


We also choose an initial velocity $g(x)$ identically equal to 0. Then, we
can compute the Fourier coefficients of $f$ (Exercise 9), and assuming that
the answer to the question raised before (5) is positive, we obtain

$$f(x) = \sum_{m=1}^{\infty} A_m \sin mx \quad \text{with} \quad A_m = \frac{2h}{m^2} \, \frac{\sin mp}{p(\pi - p)}.$$

Thus

$$u(x,t) = \sum_{m=1}^{\infty} A_m \cos mt \sin mx, \tag{8}$$


and note that this series converges absolutely. The solution can also be
expressed in terms of traveling waves. In fact

$$u(x,t) = \frac{f(x + t) + f(x - t)}{2}. \tag{9}$$

Here $f(x)$ is defined for all $x$ as follows: first, $f$ is extended to $[-\pi,\pi]$ by
making it odd, and then $f$ is extended to the whole real line by making
it periodic of period $2\pi$, that is, $f(x + 2\pi k) = f(x)$ for all integers $k$.
Observe that (8) implies (9) in view of the trigonometric identity

$$\cos v \sin u = \frac{1}{2} \bigl[ \sin(u + v) + \sin(u - v) \bigr].$$


As a final remark, we should note an unsatisfactory aspect of the solution
to this problem, which however is in the nature of things. Since
the initial data $f(x)$ for the plucked string is not twice continuously
differentiable, neither is the function $u$ (given by (9)). Hence $u$ is not truly
a solution of the wave equation: while $u(x, t)$ does represent the position
of the plucked string, it does not satisfy the partial differential equation
we set out to solve! This state of affairs may be understood properly
only if we realize that $u$ does solve the equation, but in an appropriate
generalized sense. A better understanding of this phenomenon requires
ideas relevant to the study of “weak solutions” and the theory of
“distributions.” These topics we consider only later, in Books III and IV.
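
The agreement between the standing-wave series (8) and the traveling-wave formula (9) can be observed numerically. This is a rough sketch under stated assumptions: the pluck height `H` and pluck point `P` are arbitrary illustrative values, the series is truncated at a finite number of terms, and all function names are hypothetical.

```python
import math

H, P = 1.0, 1.0  # hypothetical pluck height h and pluck point p, with 0 < p < pi

def f_initial(x):
    """Triangular initial shape on [0, pi]."""
    if x <= P:
        return x * H / P
    return H * (math.pi - x) / (math.pi - P)

def f_extended(x):
    """Odd, 2*pi-periodic extension of the initial shape to the whole line."""
    x = (x + math.pi) % (2 * math.pi) - math.pi   # reduce to [-pi, pi)
    return f_initial(x) if x >= 0 else -f_initial(-x)

def coefficient(m):
    """A_m = 2 h sin(m p) / (m^2 p (pi - p))."""
    return 2 * H * math.sin(m * P) / (m * m * P * (math.pi - P))

def standing(x, t, terms=2000):
    """Partial sum of the standing-wave series (8)."""
    return sum(coefficient(m) * math.cos(m * t) * math.sin(m * x)
               for m in range(1, terms + 1))

def traveling(x, t):
    """Traveling-wave formula (9)."""
    return 0.5 * (f_extended(x + t) + f_extended(x - t))

for x, t in [(0.5, 0.3), (1.0, 2.0), (2.5, 4.0)]:
    print(x, t, standing(x, t), traveling(x, t))
```

Because $|A_m|$ decays like $1/m^2$, the truncation error of the partial sum is of order $1/\text{terms}$, so the two columns should agree to roughly three decimal places.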


2 The heat equation 

We now discuss the problem of heat diffusion by following the same 
framework as for the wave equation. First, we derive the time-dependent 
heat equation, and then study the steady-state heat equation in the disc, 
which leads us back to the basic question (7). 

2.1 Derivation of the heat equation 

Consider an infinite metal plate which we model as the plane $\mathbb{R}^2$, and
suppose we are given an initial heat distribution at time $t = 0$. Let the
temperature at the point $(x,y)$ at time $t$ be denoted by $u(x,y,t)$.










Consider a small square $S$ centered at $(x_0, y_0)$ with sides parallel to the
axes and of side length $h$, as shown in Figure 9. The amount of heat
energy in $S$ at time $t$ is given by

$$H(t) = \sigma \iint_S u(x, y, t) \, dx \, dy,$$

where $\sigma > 0$ is a constant called the specific heat of the material. Therefore,
the heat flow into $S$ is

$$\frac{\partial H}{\partial t} = \sigma \iint_S \frac{\partial u}{\partial t} \, dx \, dy,$$

which is approximately equal to

$$\sigma h^2 \, \frac{\partial u}{\partial t}(x_0, y_0, t),$$

since the area of $S$ is $h^2$. Now we apply Newton’s law of cooling, which
states that heat flows from the higher to lower temperature at a rate
proportional to the difference, that is, the gradient.

Figure 9. Heat flow through a small square


The heat flow through the vertical side on the right is therefore

$$-\kappa h \, \frac{\partial u}{\partial x}(x_0 + h/2, y_0, t),$$

where $\kappa > 0$ is the conductivity of the material. A similar argument for
the other sides shows that the total heat flow through the square $S$ is
given by

$$\kappa h \left[ \frac{\partial u}{\partial x}(x_0 + h/2, y_0, t) - \frac{\partial u}{\partial x}(x_0 - h/2, y_0, t) + \frac{\partial u}{\partial y}(x_0, y_0 + h/2, t) - \frac{\partial u}{\partial y}(x_0, y_0 - h/2, t) \right].$$

Applying the mean value theorem and letting $h$ tend to zero, we find
that

$$\frac{\sigma}{\kappa} \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2};$$

this is called the time-dependent heat equation, often abbreviated
to the heat equation.
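
The limiting step can be spelled out a little further. The sketch below assumes $u$ has continuous second partial derivatives; the intermediate points $\xi$ and $\eta$ are names introduced here for the points supplied by the mean value theorem.

```latex
\begin{aligned}
\frac{\partial u}{\partial x}(x_0 + h/2, y_0, t) - \frac{\partial u}{\partial x}(x_0 - h/2, y_0, t)
  &= h \, \frac{\partial^2 u}{\partial x^2}(\xi, y_0, t), \qquad |\xi - x_0| < h/2, \\
\frac{\partial u}{\partial y}(x_0, y_0 + h/2, t) - \frac{\partial u}{\partial y}(x_0, y_0 - h/2, t)
  &= h \, \frac{\partial^2 u}{\partial y^2}(x_0, \eta, t), \qquad |\eta - y_0| < h/2, \\
\sigma h^2 \, \frac{\partial u}{\partial t}(x_0, y_0, t)
  &\approx \kappa h^2 \left[ \frac{\partial^2 u}{\partial x^2}(\xi, y_0, t) + \frac{\partial^2 u}{\partial y^2}(x_0, \eta, t) \right].
\end{aligned}
```

Dividing by $\kappa h^2$ and letting $h \to 0$, so that $\xi \to x_0$ and $\eta \to y_0$, gives the heat equation at the point $(x_0, y_0)$.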


2.2 Steady-state heat equation in the disc 

After a long period of time, there is no more heat exchange, so that 
the system reaches thermal equilibrium and du/dt = 0. In this case, 
the time-dependent heat equation reduces to the steady-state heat 
equation 


$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0. \tag{10}$$

The operator $\partial^2/\partial x^2 + \partial^2/\partial y^2$ is of such importance in mathematics and
physics that it is often abbreviated as $\Delta$ and given a name: the Laplace
operator or Laplacian. So the steady-state heat equation is written as

$$\Delta u = 0,$$

and solutions to this equation are called harmonic functions.
Consider the unit disc in the plane

$$D = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 < 1\},$$

whose boundary is the unit circle $C$. In polar coordinates $(r, \theta)$, with
$0 \le r$ and $0 \le \theta < 2\pi$, we have

$$D = \{(r, \theta) : 0 \le r < 1\} \quad \text{and} \quad C = \{(r, \theta) : r = 1\}.$$


The problem, often called the Dirichlet problem (for the Laplacian 
on the unit disc), is to solve the steady-state heat equation in the unit
disc subject to the boundary condition $u = f$ on $C$. This corresponds to
fixing a predetermined temperature distribution on the circle, waiting a
long time, and then looking at the temperature distribution inside the
disc.



Figure 10. The Dirichlet problem for the disc 


While the method of separation of variables will turn out to be useful 
for equation (10), a difficulty comes from the fact that the boundary 
condition is not easily expressed in terms of rectangular coordinates. 
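Since this boundary condition is best described by the coordinates $(r, \theta)$,
namely $u(1, \theta) = f(\theta)$, we rewrite the Laplacian in polar coordinates. An
application of the chain rule gives (Exercise 10):

$$\Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2}.$$

We now multiply both sides by $r^2$, and since $\Delta u = 0$, we get

$$r^2 \frac{\partial^2 u}{\partial r^2} + r \frac{\partial u}{\partial r} = -\frac{\partial^2 u}{\partial \theta^2}.$$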

Separating these variables, and looking for a solution of the form
$u(r, \theta) = F(r)G(\theta)$, we find

$$\frac{r^2 F''(r) + r F'(r)}{F(r)} = -\frac{G''(\theta)}{G(\theta)}.$$

Since the two sides depend on different variables, they must both be
constant, say equal to $\lambda$. We therefore get the following equations:

$$\begin{cases} G''(\theta) + \lambda G(\theta) = 0, \\ r^2 F''(r) + r F'(r) - \lambda F(r) = 0. \end{cases}$$

Since $G$ must be periodic of period $2\pi$, this implies that $\lambda \ge 0$ and (as
we have seen before) that $\lambda = m^2$ where $m$ is an integer; hence

$$G(\theta) = A \cos m\theta + B \sin m\theta.$$

An application of Euler’s identity, $e^{ix} = \cos x + i \sin x$, allows one to
rewrite $G$ in terms of complex exponentials,

$$G(\theta) = A e^{im\theta} + B e^{-im\theta}.$$

With $\lambda = m^2$ and $m \neq 0$, two simple solutions of the equation in $F$ are
$F(r) = r^m$ and $F(r) = r^{-m}$ (Exercise 11 gives further information about
these solutions). If $m = 0$, then $F(r) = 1$ and $F(r) = \log r$ are two solutions.
If $m > 0$, we note that $r^{-m}$ grows unboundedly large as $r$ tends
to zero, so $F(r)G(\theta)$ is unbounded at the origin; the same occurs when
$m = 0$ and $F(r) = \log r$. We reject these solutions as contrary to our
intuition. Therefore, we are left with the following special functions:

$$u_m(r, \theta) = r^{|m|} e^{im\theta}, \quad m \in \mathbb{Z}.$$
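
One way to gain confidence in these special functions is to apply a finite-difference version of the polar Laplacian to them. The sketch below is illustrative only: the step size `d`, the sample point, and the helper names are assumptions, and central differences give an approximation rather than an exact value.

```python
import cmath

def u_special(m, r, theta):
    """u_m(r, theta) = r^{|m|} e^{i m theta}."""
    return r ** abs(m) * cmath.exp(1j * m * theta)

def polar_laplacian(u, r, theta, d=1e-3):
    """Central-difference approximation of u_rr + u_r / r + u_thetatheta / r^2."""
    u_rr = (u(r + d, theta) - 2 * u(r, theta) + u(r - d, theta)) / d**2
    u_r = (u(r + d, theta) - u(r - d, theta)) / (2 * d)
    u_tt = (u(r, theta + d) - 2 * u(r, theta) + u(r, theta - d)) / d**2
    return u_rr + u_r / r + u_tt / r**2

for m in (-3, 0, 2, 5):
    # Bind m as a default argument so each lambda keeps its own value.
    residual = abs(polar_laplacian(lambda r, t, m=m: u_special(m, r, t), 0.6, 1.1))
    print(m, residual)
```

The printed residuals should be small (of the order of the discretization error), consistent with each $u_m$ being harmonic away from the origin.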

We now make the important observation that (10) is linear, and so as 
in the case of the vibrating string, we may superpose the above special 
solutions to obtain the presumed general solution: 

$$u(r, \theta) = \sum_{m=-\infty}^{\infty} a_m r^{|m|} e^{im\theta}.$$

If this expression gave all the solutions to the steady-state heat equation,
then for a reasonable $f$ we should have

$$u(1, \theta) = \sum_{m=-\infty}^{\infty} a_m e^{im\theta} = f(\theta).$$

We therefore ask again in this context: given any reasonable function $f$
on $[0, 2\pi]$ with $f(0) = f(2\pi)$, can we find coefficients $a_m$ so that

$$f(\theta) = \sum_{m=-\infty}^{\infty} a_m e^{im\theta}\,?$$

Historical Note: D’Alembert (in 1747) first solved the equation of the
vibrating string using the method of traveling waves. This solution was
elaborated by Euler a year later. In 1753, D. Bernoulli proposed the
solution which for all intents and purposes is the Fourier series given
by (4), but Euler was not entirely convinced of its full generality, since
this could hold only if an “arbitrary” function could be expanded in
Fourier series. D’Alembert and other mathematicians also had doubts.
This viewpoint was changed by Fourier (in 1807) in his study of the
heat equation, where his conviction and work eventually led others to a
complete proof that a general function could be represented as a Fourier
series.

3 Exercises 

1. If $z = x + iy$ is a complex number with $x, y \in \mathbb{R}$, we define

$$|z| = (x^2 + y^2)^{1/2}$$

and call this quantity the modulus or absolute value of $z$.

(a) What is the geometric interpretation of $|z|$?

(b) Show that if $|z| = 0$, then $z = 0$.

(c) Show that if $\lambda \in \mathbb{R}$, then $|\lambda z| = |\lambda||z|$, where $|\lambda|$ denotes the standard
absolute value of a real number.

(d) If $z_1$ and $z_2$ are two complex numbers, prove that

$$|z_1 z_2| = |z_1||z_2| \quad \text{and} \quad |z_1 + z_2| \le |z_1| + |z_2|.$$

(e) Show that if $z \neq 0$, then $|1/z| = 1/|z|$.


2. If $z = x + iy$ is a complex number with $x, y \in \mathbb{R}$, we define the complex
conjugate of $z$ by

$$\bar{z} = x - iy.$$

(a) What is the geometric interpretation of $\bar{z}$?

(b) Show that $|z|^2 = z\bar{z}$.


(c) Prove that if $z$ belongs to the unit circle, then $1/z = \bar{z}$.

3. A sequence of complex numbers $\{w_n\}_{n=1}^{\infty}$ is said to converge if there exists
$w \in \mathbb{C}$ such that

$$\lim_{n \to \infty} |w_n - w| = 0,$$

and we say that $w$ is a limit of the sequence.

(a) Show that a converging sequence of complex numbers has a unique limit.

The sequence $\{w_n\}_{n=1}^{\infty}$ is said to be a Cauchy sequence if for every $\epsilon > 0$ there
exists a positive integer $N$ such that

$$|w_n - w_m| < \epsilon \quad \text{whenever } n, m > N.$$

(b) Prove that a sequence of complex numbers converges if and only if it is a
Cauchy sequence. [Hint: A similar theorem exists for the convergence of a
sequence of real numbers. Why does it carry over to sequences of complex
numbers?]

A series $\sum_{n=1}^{\infty} z_n$ of complex numbers is said to converge if the sequence formed
by the partial sums

$$S_N = \sum_{n=1}^{N} z_n$$

converges. Let $\{a_n\}_{n=1}^{\infty}$ be a sequence of non-negative real numbers such that
the series $\sum_n a_n$ converges.

(c) Show that if $\{z_n\}_{n=1}^{\infty}$ is a sequence of complex numbers satisfying
$|z_n| \le a_n$ for all $n$, then the series $\sum_n z_n$ converges. [Hint: Use the Cauchy
criterion.]


4. For $z \in \mathbb{C}$, we define the complex exponential by

$$e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}.$$

(a) Prove that the above definition makes sense, by showing that the series
converges for every complex number $z$. Moreover, show that the convergence
is uniform⁵ on every bounded subset of $\mathbb{C}$.

(b) If $z_1, z_2$ are two complex numbers, prove that $e^{z_1} e^{z_2} = e^{z_1 + z_2}$. [Hint: Use
the binomial theorem to expand $(z_1 + z_2)^n$, as well as the formula for the
binomial coefficients.]


5 A sequence of functions $\{f_n(z)\}_{n=1}^{\infty}$ is said to be uniformly convergent on a set $S$ if
there exists a function $f$ on $S$ so that for every $\epsilon > 0$ there is an integer $N$ such that
$|f_n(z) - f(z)| < \epsilon$ whenever $n > N$ and $z \in S$.

(c) Show that if $z$ is purely imaginary, that is, $z = iy$ with $y \in \mathbb{R}$, then

$$e^{iy} = \cos y + i \sin y.$$

This is Euler’s identity. [Hint: Use power series.]

(d) More generally, show that

$$e^{x + iy} = e^x (\cos y + i \sin y)$$

whenever $x, y \in \mathbb{R}$, and that

$$|e^{x + iy}| = e^x.$$

(e) Prove that $e^z = 1$ if and only if $z = 2\pi k i$ for some integer $k$.

(f) Show that every complex number $z = x + iy$ can be written in the form

$$z = r e^{i\theta},$$

where $r$ is unique and in the range $0 \le r < \infty$, and $\theta \in \mathbb{R}$ is unique up to
an integer multiple of $2\pi$. Check that

$$r = |z| \quad \text{and} \quad \theta = \arctan(y/x)$$

whenever these formulas make sense.

(g) In particular, $i = e^{i\pi/2}$. What is the geometric meaning of multiplying a
complex number by $i$? Or by $e^{i\theta}$ for any $\theta \in \mathbb{R}$?

(h) Given $\theta \in \mathbb{R}$, show that

$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \quad \text{and} \quad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}.$$

These are also called Euler’s identities.

(i) Use the complex exponential to derive trigonometric identities such as

$$\cos(\theta + \vartheta) = \cos\theta \cos\vartheta - \sin\theta \sin\vartheta,$$

and then show that

$$2 \sin\theta \sin\varphi = \cos(\theta - \varphi) - \cos(\theta + \varphi),$$

$$2 \sin\theta \cos\varphi = \sin(\theta + \varphi) + \sin(\theta - \varphi).$$

This calculation connects the solution given by d’Alembert in terms of
traveling waves and the solution in terms of superposition of standing
waves.












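5. Verify that $f(x) = e^{inx}$ is periodic with period $2\pi$ and that

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} e^{inx} \, dx = \begin{cases} 1 & \text{if } n = 0, \\ 0 & \text{if } n \neq 0. \end{cases}$$

Use this fact to prove that if $n, m \ge 1$ we have

$$\frac{1}{\pi} \int_{-\pi}^{\pi} \cos nx \cos mx \, dx = \begin{cases} 0 & \text{if } n \neq m, \\ 1 & \text{if } n = m, \end{cases}$$

and similarly

$$\frac{1}{\pi} \int_{-\pi}^{\pi} \sin nx \sin mx \, dx = \begin{cases} 0 & \text{if } n \neq m, \\ 1 & \text{if } n = m. \end{cases}$$

Finally, show that

$$\int_{-\pi}^{\pi} \sin nx \cos mx \, dx = 0 \quad \text{for any } n, m.$$

[Hint: Calculate $e^{inx} e^{-imx} + e^{inx} e^{imx}$ and $e^{inx} e^{-imx} - e^{inx} e^{imx}$.]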


6. Prove that if $f$ is a twice continuously differentiable function on $\mathbb{R}$ which is
a solution of the equation

$$f''(t) + c^2 f(t) = 0,$$

then there exist constants $a$ and $b$ such that

$$f(t) = a \cos ct + b \sin ct.$$

This can be done by differentiating the two functions $g(t) = f(t) \cos ct - c^{-1} f'(t) \sin ct$
and $h(t) = f(t) \sin ct + c^{-1} f'(t) \cos ct$.


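7. Show that if $a$ and $b$ are real, then one can write

$$a \cos ct + b \sin ct = A \cos(ct - \varphi),$$

where $A = \sqrt{a^2 + b^2}$, and $\varphi$ is chosen so that

$$\cos\varphi = \frac{a}{\sqrt{a^2 + b^2}} \quad \text{and} \quad \sin\varphi = \frac{b}{\sqrt{a^2 + b^2}}.$$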


8. Suppose $F$ is a function on $(a, b)$ with two continuous derivatives. Show that
whenever $x$ and $x + h$ belong to $(a, b)$, one may write

$$F(x + h) = F(x) + h F'(x) + \frac{h^2}{2} F''(x) + h^2 \varphi(h),$$

where $\varphi(h) \to 0$ as $h \to 0$.
Deduce that

$$\frac{F(x + h) + F(x - h) - 2F(x)}{h^2} \to F''(x) \quad \text{as } h \to 0.$$

[Hint: This is simply a Taylor expansion. It may be obtained by noting that

$$F(x + h) - F(x) = \int_x^{x+h} F'(y) \, dy,$$

and then writing $F'(y) = F'(x) + (y - x) F''(x) + (y - x) \psi(y - x)$, where $\psi(h) \to 0$
as $h \to 0$.]


9. In the case of the plucked string, use the formula for the Fourier sine
coefficients to show that

$$A_m = \frac{2h}{m^2} \, \frac{\sin mp}{p(\pi - p)}.$$

For what position of $p$ are the second, fourth, ... harmonics missing? For what
position of $p$ are the third, sixth, ... harmonics missing?


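10. Show that the expression of the Laplacian

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$$

is given in polar coordinates by the formula

$$\Delta = \frac{\partial^2}{\partial r^2} + \frac{1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \frac{\partial^2}{\partial \theta^2}.$$

Also, prove that

$$\left| \frac{\partial u}{\partial x} \right|^2 + \left| \frac{\partial u}{\partial y} \right|^2 = \left| \frac{\partial u}{\partial r} \right|^2 + \frac{1}{r^2} \left| \frac{\partial u}{\partial \theta} \right|^2.$$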


11. Show that if $n \in \mathbb{Z}$ the only solutions of the differential equation

$$r^2 F''(r) + r F'(r) - n^2 F(r) = 0,$$

which are twice differentiable when $r > 0$, are given by linear combinations of
$r^n$ and $r^{-n}$ when $n \neq 0$, and $1$ and $\log r$ when $n = 0$.

[Hint: If $F$ solves the equation, write $F(r) = g(r) r^n$, find the equation satisfied
by $g$, and conclude that $r g'(r) + 2n g(r) = c$ where $c$ is a constant.]

Figure 11. Dirichlet problem in a rectangle


4 Problem 


1. Consider the Dirichlet problem illustrated in Figure 11.

More precisely, we look for a solution of the steady-state heat equation
$\Delta u = 0$ in the rectangle $R = \{(x, y) : 0 \le x \le \pi, \ 0 \le y \le 1\}$ that vanishes on
the vertical sides of $R$, and so that

$$u(x, 0) = f_0(x) \quad \text{and} \quad u(x, 1) = f_1(x),$$

where $f_0$ and $f_1$ are initial data which fix the temperature distribution on the
horizontal sides of the rectangle.

Use separation of variables to show that if $f_0$ and $f_1$ have Fourier expansions

$$f_0(x) = \sum_{k=1}^{\infty} A_k \sin kx \quad \text{and} \quad f_1(x) = \sum_{k=1}^{\infty} B_k \sin kx,$$

then

$$u(x, y) = \sum_{k=1}^{\infty} \left( \frac{\sinh k(1 - y)}{\sinh k} A_k + \frac{\sinh ky}{\sinh k} B_k \right) \sin kx.$$

We recall the definitions of the hyperbolic sine and cosine functions:

$$\sinh x = \frac{e^x - e^{-x}}{2} \quad \text{and} \quad \cosh x = \frac{e^x + e^{-x}}{2}.$$


Compare this result with the solution of the Dirichlet problem in the strip
obtained in Problem 3, Chapter 5.

Basic Properties of Fourier Series


Nearly fifty years had passed without any progress on
the question of analytic representation of an arbitrary
function, when an assertion of Fourier threw new light
on the subject. Thus a new era began for the development
of this part of Mathematics and this was
heralded in a stunning way by major developments in
mathematical Physics.

B. Riemann, 1854


In this chapter, we begin our rigorous study of Fourier series. We set
the stage by introducing the main objects in the subject, and then formulate
some basic problems which we have already touched upon earlier.

Our first result disposes of the question of uniqueness: Are two functions
with the same Fourier coefficients necessarily equal? Indeed, a
simple argument shows that if both functions are continuous, then in
fact they must agree.

Next, we take a closer look at the partial sums of a Fourier series. Using
the formula for the Fourier coefficients (which involves an integration),
we make the key observation that these sums can be written conveniently
as integrals:

$$\frac{1}{2\pi} \int D_N(x - y) f(y) \, dy,$$

where $\{D_N\}$ is a family of functions called the Dirichlet kernels. The
above expression is the convolution of $f$ with the function $D_N$. Convolutions
will play a critical role in our analysis. In general, given a family
of functions $\{K_n\}$, we are led to investigate the limiting properties as $n$
tends to infinity of the convolutions

$$\frac{1}{2\pi} \int K_n(x - y) f(y) \, dy.$$

We find that if the family $\{K_n\}$ satisfies the three important properties
of “good kernels,” then the convolutions above tend to $f(x)$ as $n \to \infty$
(at least when $f$ is continuous). In this sense, the family $\{K_n\}$ is an
“approximation to the identity.” Unfortunately, the Dirichlet kernels
$D_N$ do not belong to the category of good kernels, which indicates that
the question of convergence of Fourier series is subtle.

Instead of pursuing at this stage the problem of convergence, we consider
various other methods of summing the Fourier series of a function.
The first method, which involves averages of partial sums, leads to
convolutions with good kernels, and yields an important theorem of Fejér.
From this, we deduce the fact that a continuous function on the circle
can be approximated uniformly by trigonometric polynomials. Second,
we may also sum the Fourier series in the sense of Abel and again encounter
a family of good kernels. In this case, the results about convolutions
and good kernels lead to a solution of the Dirichlet problem for
the steady-state heat equation in the disc, considered at the end of the
previous chapter.


1 Examples and formulation of the problem 

We commence with a brief description of the types of functions with
which we shall be concerned. Since the Fourier coefficients of $f$ are
defined by

$$a_n = \frac{1}{L} \int_0^L f(x) e^{-2\pi i n x / L} \, dx, \quad \text{for } n \in \mathbb{Z},$$

where $f$ is complex-valued on $[0, L]$, it will be necessary to place some
integrability conditions on $f$. We shall therefore assume for the remainder
of this book that all functions are at least Riemann integrable.¹ Sometimes
it will be illuminating to focus our attention on functions that
are more “regular,” that is, functions that possess certain continuity or
differentiability properties. Below, we list several classes of functions in
increasing order of generality. We emphasize that we will not generally
restrict our attention to real-valued functions, contrary to what the following
pictures may suggest; we will almost always allow functions that
take values in the complex numbers $\mathbb{C}$. Furthermore, we sometimes think
of our functions as being defined on the circle rather than an interval.
We elaborate upon this below.


1 Limiting ourselves to Riemann integrable functions is natural at this elementary stage 
of study of the subject. The more advanced notion of Lebesgue integrability will be taken 
up in Book III.

Everywhere continuous functions

These are the complex-valued functions $f$ which are continuous at every
point of the segment $[0, L]$. A typical continuous function is sketched in
Figure 1 (a). We shall note later that continuous functions on the circle
satisfy the additional condition $f(0) = f(L)$.

Piecewise continuous functions 

These are bounded functions on [0, L] which have only finitely many 
discontinuities. An example of such a function with simple discontinuities 
is pictured in Figure 1 (b). 


Figure 1. Functions on $[0, L]$: (a) continuous and (b) piecewise continuous


This class of functions is wide enough to illustrate many of the theorems
in the next few chapters. However, for logical completeness we
consider also the more general class of Riemann integrable functions.
This more extended setting is natural since the formula for the Fourier
coefficients involves integration.

Riemann integrable functions 

This is the most general class of functions we will be concerned with.
Such functions are bounded, but may have infinitely many discontinuities.
We recall the definition of integrability. A real-valued function $f$
defined on $[0, L]$ is Riemann integrable (which we abbreviate as
integrable²) if it is bounded, and if for every $\epsilon > 0$, there is a subdivision
$0 = x_0 < x_1 < \cdots < x_{N-1} < x_N = L$ of the interval $[0, L]$, so that if $\mathcal{U}$
and $\mathcal{L}$ are, respectively, the upper and lower sums of $f$ for this subdivision,
namely

$$\mathcal{U} = \sum_{j=1}^{N} \Bigl[ \sup_{x_{j-1} \le x \le x_j} f(x) \Bigr] (x_j - x_{j-1})$$

and

$$\mathcal{L} = \sum_{j=1}^{N} \Bigl[ \inf_{x_{j-1} \le x \le x_j} f(x) \Bigr] (x_j - x_{j-1}),$$

then we have $\mathcal{U} - \mathcal{L} < \epsilon$. Finally, we say that a complex-valued function
is integrable if its real and imaginary parts are integrable. It is worthwhile
to remember at this point that the sum and product of two integrable
functions are integrable.

2 Starting in Book III, the term “integrable” will be used in the broader sense of
Lebesgue theory.
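
The definition can be made concrete in a simple special case. The sketch below assumes a nondecreasing function on $[0, L]$ and a uniform subdivision, so that the supremum and infimum on each subinterval are attained at its right and left endpoints; the names `upper_lower_sums` and `square` are illustrative, not from the text.

```python
def upper_lower_sums(f, length, n):
    """Upper and lower sums on [0, length] for a NONDECREASING f, uniform subdivision.

    For a nondecreasing function the sup on [x_{j-1}, x_j] is f(x_j) and the
    inf is f(x_{j-1}), so no search over the subinterval is needed.
    """
    width = length / n
    upper = sum(f((j + 1) * width) for j in range(n)) * width
    lower = sum(f(j * width) for j in range(n)) * width
    return upper, lower

def square(x):
    # Hypothetical test function, nondecreasing on [0, 1].
    return x * x

for n in (10, 100, 1000):
    upper, lower = upper_lower_sums(square, 1.0, n)
    print(n, upper - lower)  # the gap shrinks like 1/n
```

In this monotone case the gap telescopes to $(f(L) - f(0)) \cdot L/N$, so it can be made smaller than any given $\epsilon$ by taking $N$ large, which is exactly the integrability criterion.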

A simple example of an integrable function on $[0,1]$ with infinitely
many discontinuities is given by

$$f(x) = \begin{cases} 1 & \text{if } 1/(n+1) < x \le 1/n \text{ and } n \text{ is odd}, \\ 0 & \text{if } 1/(n+1) < x \le 1/n \text{ and } n \text{ is even}, \\ 0 & \text{if } x = 0. \end{cases}$$

This example is illustrated in Figure 2. Note that $f$ is discontinuous
when $x = 1/n$ and at $x = 0$.

Figure 2. A Riemann integrable function


More elaborate examples of integrable functions whose discontinuities
are dense in the interval $[0,1]$ are described in Problem 1. In general,
while integrable functions may have infinitely many discontinuities, these
functions are actually characterized by the fact that, in a precise sense,
their discontinuities are not too numerous: they are “negligible,” that is,
the set of points where an integrable function is discontinuous has
“measure 0.” The reader will find further details about Riemann integration
in the appendix.

From now on, we shall always assume that our functions are integrable, 
even if we do not state this requirement explicitly. 


Functions on the circle 

There is a natural connection between $2\pi$-periodic functions on $\mathbb{R}$ like the
exponentials $e^{in\theta}$, functions on an interval of length $2\pi$, and functions on
the unit circle. This connection arises as follows.

A point on the unit circle takes the form $e^{i\theta}$, where $\theta$ is a real number
that is unique up to integer multiples of $2\pi$. If $F$ is a function on the
circle, then we may define for each real number $\theta$

$$f(\theta) = F(e^{i\theta}),$$

and observe that with this definition, the function $f$ is periodic on $\mathbb{R}$ of
period $2\pi$, that is, $f(\theta + 2\pi) = f(\theta)$ for all $\theta$. The integrability, continuity
and other smoothness properties of $F$ are determined by those of $f$.
For instance, we say that $F$ is integrable on the circle if $f$ is integrable
on every interval of length $2\pi$. Also, $F$ is continuous on the circle if $f$
is continuous on $\mathbb{R}$, which is the same as saying that $f$ is continuous on
any interval of length $2\pi$. Moreover, $F$ is continuously differentiable if $f$
has a continuous derivative, and so forth.

Since $f$ has period $2\pi$, we may restrict it to any interval of length $2\pi$,
say $[0, 2\pi]$ or $[-\pi, \pi]$, and still capture the initial function $F$ on the circle.
We note that $f$ must take the same value at the end-points of the interval
since they correspond to the same point on the circle. Conversely, any
function on $[0, 2\pi]$ for which $f(0) = f(2\pi)$ can be extended to a periodic
function on $\mathbb{R}$ which can then be identified as a function on the circle.
In particular, a continuous function $f$ on the interval $[0, 2\pi]$ gives rise to
a continuous function on the circle if and only if $f(0) = f(2\pi)$.

In conclusion, functions on $\mathbb{R}$ that are $2\pi$-periodic, and functions on an
interval of length $2\pi$ that take on the same value at its end-points, are
two equivalent descriptions of the same mathematical objects, namely,
functions on the circle.

In this connection, we mention an item of notational usage. When
our functions are defined on an interval on the line, we often use $x$ as
the independent variable; however, when we consider these as functions
on the circle, we usually replace the variable $x$ by $\theta$. As the reader will
note, we are not strictly bound by this rule since this practice is mostly
a matter of convenience.


1.1 Main definitions and some examples 

We now begin our study of Fourier analysis with the precise definition of
the Fourier series of a function. Here, it is important to pin down where
our function is originally defined. If $f$ is an integrable function given on
an interval $[a, b]$ of length $L$ (that is, $b - a = L$), then the $n$th Fourier
coefficient of $f$ is defined by

$$\hat{f}(n) = \frac{1}{L} \int_a^b f(x) e^{-2\pi i n x / L} \, dx, \quad n \in \mathbb{Z}.$$

The Fourier series of $f$ is given formally³ by

$$\sum_{n=-\infty}^{\infty} \hat{f}(n) e^{2\pi i n x / L}.$$

We shall sometimes write $a_n$ for the Fourier coefficients of $f$, and use the
notation

$$f(x) \sim \sum_{n=-\infty}^{\infty} a_n e^{2\pi i n x / L}$$

to indicate that the series on the right-hand side is the Fourier series of
$f$.


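For instance, if $f$ is an integrable function on the interval $[-\pi, \pi]$, then
the $n$th Fourier coefficient of $f$ is

$$\hat{f}(n) = a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta} \, d\theta, \quad n \in \mathbb{Z},$$

and the Fourier series of $f$ is

$$f(\theta) \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}.$$

Here we use $\theta$ as a variable since we think of it as an angle ranging from
$-\pi$ to $\pi$.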


3 At this point, we do not say anything about the convergence of the series.

Also, if $f$ is defined on $[0, 2\pi]$, then the formulas are the same as
above, except that we integrate from $0$ to $2\pi$ in the definition of the
Fourier coefficients.

We may also consider the Fourier coefficients and Fourier series for a
function defined on the circle. By our previous discussion, we may think
of a function on the circle as a function $f$ on $\mathbb{R}$ which is $2\pi$-periodic.
We may restrict the function $f$ to any interval of length $2\pi$, for instance
$[0, 2\pi]$ or $[-\pi, \pi]$, and compute its Fourier coefficients. Fortunately, $f$ is
periodic and Exercise 1 shows that the resulting integrals are independent
of the chosen interval. Thus the Fourier coefficients of a function on the
circle are well defined.
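
The independence of the chosen interval can be observed numerically. This is a minimal sketch, not a proof: `fourier_coefficient` uses the midpoint rule, and the smooth periodic test function `f` and the starting points `a` are arbitrary illustrative choices.

```python
import cmath
import math

def fourier_coefficient(f, n, a=-math.pi, steps=2000):
    """Approximate (1/2pi) * integral over [a, a + 2pi] of f(theta) e^{-i n theta}."""
    h = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        theta = a + (k + 0.5) * h      # midpoint of the k-th subinterval
        total += f(theta) * cmath.exp(-1j * n * theta)
    return total * h / (2 * math.pi)

def f(theta):
    # Hypothetical smooth 2*pi-periodic test function.
    return math.exp(math.cos(theta)) + 1j * math.sin(3 * theta)

for a in (-math.pi, 0.0, 1.234):
    print(a, fourier_coefficient(f, 3, a=a))
```

The three printed values should agree to many digits, reflecting the fact that integrating a periodic function over any full period gives the same result.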

Finally, we shall sometimes consider a function $g$ given on $[0,1]$. Then

$$\hat{g}(n) = a_n = \int_0^1 g(x) e^{-2\pi i n x} \, dx \quad \text{and} \quad g(x) \sim \sum_{n=-\infty}^{\infty} a_n e^{2\pi i n x}.$$

Here we use $x$ for a variable ranging from 0 to 1.

Of course, if $f$ is initially given on $[0, 2\pi]$, then $g(x) = f(2\pi x)$ is defined
on $[0,1]$ and a change of variables shows that the $n$th Fourier coefficient
of $f$ equals the $n$th Fourier coefficient of $g$.
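
The change of variables mentioned above is short enough to write out. With the substitution $\theta = 2\pi x$, so that $dx = d\theta / 2\pi$:

```latex
\hat{g}(n) = \int_0^1 f(2\pi x)\, e^{-2\pi i n x} \, dx
           = \frac{1}{2\pi} \int_0^{2\pi} f(\theta)\, e^{-i n \theta} \, d\theta
           = \hat{f}(n).
```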

Fourier series are part of a larger family called the trigonometric series
which, by definition, are expressions of the form $\sum_{n=-\infty}^{\infty} c_n e^{2\pi i n x / L}$
where $c_n \in \mathbb{C}$. If a trigonometric series involves only finitely many
non-zero terms, that is, $c_n = 0$ for all large $|n|$, it is called a trigonometric
polynomial; its degree is the largest value of $|n|$ for which $c_n \neq 0$.

The $N$th partial sum of the Fourier series of $f$, for $N$ a positive
integer, is a particular example of a trigonometric polynomial. It is
given by

$$S_N(f)(x) = \sum_{n=-N}^{N} \hat{f}(n) e^{2\pi i n x / L}.$$

Note that by definition, the above sum is symmetric since $n$ ranges from
$-N$ to $N$, a choice that is natural because of the resulting decomposition
of the Fourier series as sine and cosine series. As a consequence, the
convergence of Fourier series will be understood (in this book) as the
“limit” as $N$ tends to infinity of these symmetric sums.

In fact, using the partial sums of the Fourier series, we can reformulate
the basic question raised in Chapter 1 as follows:

Problem: In what sense does $S_N(f)$ converge to $f$ as $N \to \infty$?

Before proceeding further with this question, we turn to some simple
examples of Fourier series.


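Example 1. Let $f(\theta) = \theta$ for $-\pi \le \theta \le \pi$. The calculation of the Fourier
coefficients requires a simple integration by parts. First, if $n \neq 0$, then

$$\hat{f}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \theta\, e^{-in\theta}\,d\theta
= \frac{1}{2\pi}\left[-\frac{\theta}{in}\, e^{-in\theta}\right]_{-\pi}^{\pi} + \frac{1}{2\pi i n}\int_{-\pi}^{\pi} e^{-in\theta}\,d\theta
= \frac{(-1)^{n+1}}{in},$$

and if $n = 0$ we clearly have

$$\hat{f}(0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \theta\,d\theta = 0.$$

Hence, the Fourier series of $f$ is given by

$$f(\theta) \sim \sum_{n \neq 0} \frac{(-1)^{n+1}}{in}\, e^{in\theta} = 2\sum_{n=1}^{\infty} (-1)^{n+1}\, \frac{\sin n\theta}{n}.$$

The first sum is over all non-zero integers, and the second is obtained by
an application of Euler’s identities. It is possible to prove by elementary
means that the above series converges for every $\theta$, but it is not obvious
that it converges to $f(\theta)$. This will be proved later (Exercises 8 and 9
deal with a similar situation).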
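The closed form in Example 1 can be checked numerically. The following is only a sketch: the helper names `fourier_coefficient` and `exact_coefficient` are ours, and the integral is approximated with a midpoint rule, so agreement holds up to a small discretization error.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=4000):
    """Midpoint-rule approximation of (1/2pi) * integral over [-pi, pi] of f(t) e^{-int} dt."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        theta = -math.pi + (j + 0.5) * h
        total += f(theta) * cmath.exp(-1j * n * theta)
    return total * h / (2 * math.pi)

def exact_coefficient(n):
    """Closed form from Example 1: (-1)^(n+1) / (i n) for n != 0, and 0 for n = 0."""
    if n == 0:
        return 0j
    return (-1) ** (n + 1) / (1j * n)

def sawtooth(theta):
    return theta

# Largest discrepancy between the numerical and exact coefficients for |n| <= 5.
max_error = max(
    abs(fourier_coefficient(sawtooth, n) - exact_coefficient(n))
    for n in range(-5, 6)
)
print(max_error)
```

The printed discrepancy should be small and should shrink as `samples` grows.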


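Example 2. Define $f(\theta) = (\pi - \theta)^2/4$ for $0 \le \theta \le 2\pi$. Then successive
integration by parts similar to that performed in the previous example
yields

$$f(\theta) \sim \frac{\pi^2}{12} + \sum_{n=1}^{\infty} \frac{\cos n\theta}{n^2}.$$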


Example 3. The Fourier series of the function

$$f(\theta) = \frac{\pi}{\sin \pi\alpha}\, e^{i(\pi - \theta)\alpha}$$

on $[0, 2\pi]$ is

$$f(\theta) \sim \sum_{n=-\infty}^{\infty} \frac{e^{in\theta}}{n + \alpha},$$

whenever $\alpha$ is not an integer.

Example 4. The trigonometric polynomial defined for $x \in [-\pi, \pi]$ by

$$D_N(x) = \sum_{n=-N}^{N} e^{inx}$$

is called the $N$th Dirichlet kernel and is of fundamental importance in
the theory (as we shall see later). Notice that its Fourier coefficients $a_n$
have the property that $a_n = 1$ if $|n| \le N$ and $a_n = 0$ otherwise. A closed
form formula for the Dirichlet kernel is

$$D_N(x) = \frac{\sin\left((N + \tfrac{1}{2})x\right)}{\sin(x/2)}.$$

This can be seen by summing the geometric progressions

$$\sum_{n=0}^{N} \omega^n \quad\text{and}\quad \sum_{n=-N}^{-1} \omega^n$$

with $\omega = e^{ix}$. These sums are, respectively, equal to

$$\frac{1 - \omega^{N+1}}{1 - \omega} \quad\text{and}\quad \frac{\omega^{-N} - 1}{1 - \omega}.$$

Their sum is then

$$\frac{\omega^{-N} - \omega^{N+1}}{1 - \omega} = \frac{\omega^{-N-1/2} - \omega^{N+1/2}}{\omega^{-1/2} - \omega^{1/2}} = \frac{\sin\left((N + \tfrac{1}{2})x\right)}{\sin(x/2)},$$

giving the desired result.
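As a quick sanity check of the closed form, one can compare the exponential sum with the quotient of sines at a few sample points. This is a sketch only; the function names are ours, and the points are chosen away from multiples of $2\pi$, where the quotient is undefined.

```python
import cmath
import math

def dirichlet_sum(N, x):
    """D_N(x) computed directly as the sum of e^{inx} for n = -N, ..., N."""
    return sum(cmath.exp(1j * n * x) for n in range(-N, N + 1)).real

def dirichlet_closed(N, x):
    """D_N(x) from the closed form sin((N + 1/2) x) / sin(x / 2), valid when sin(x/2) != 0."""
    return math.sin((N + 0.5) * x) / math.sin(x / 2)

for N in (1, 5, 10):
    for x in (0.3, 1.0, 2.5):
        print(N, x, dirichlet_sum(N, x), dirichlet_closed(N, x))
```

The two columns of values should agree up to rounding error.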


Example 5. The function $P_r(\theta)$, called the Poisson kernel, is defined
for $\theta \in [-\pi, \pi]$ and $0 \le r < 1$ by the absolutely and uniformly convergent
series

$$P_r(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}.$$

This function arose implicitly in the solution of the steady-state heat
equation on the unit disc discussed in Chapter 1. Note that in calculating
the Fourier coefficients of $P_r(\theta)$ we can interchange the order of
integration and summation since the sum converges uniformly in $\theta$ for
each fixed $r$, and obtain that the $n$th Fourier coefficient equals $r^{|n|}$. One
can also sum the series for $P_r(\theta)$ and see that

$$P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2}.$$

In fact,

$$P_r(\theta) = \sum_{n=0}^{\infty} \omega^n + \sum_{n=1}^{\infty} \bar{\omega}^n \quad\text{with } \omega = re^{i\theta},$$

where both series converge absolutely. The first sum (an infinite geometric
progression) equals $1/(1 - \omega)$, and likewise, the second is $\bar{\omega}/(1 - \bar{\omega})$.
Together, they combine to give

$$\frac{1 - \bar{\omega} + (1 - \omega)\bar{\omega}}{(1 - \omega)(1 - \bar{\omega})} = \frac{1 - |\omega|^2}{|1 - \omega|^2} = \frac{1 - r^2}{1 - 2r\cos\theta + r^2},$$

as claimed. The Poisson kernel will reappear later in the context of Abel
summability of the Fourier series of a function.
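The identity between the series and the closed form for the Poisson kernel can also be tested numerically. The sketch below truncates the series at $|n| \le 200$ (an arbitrary cutoff of ours), which leaves a tail that is negligible for the values of $r$ used here but grows as $r$ approaches 1.

```python
import cmath
import math

def poisson_series(r, theta, terms=200):
    """Truncated series: sum of r^{|n|} e^{in theta} over |n| <= terms."""
    return sum(r ** abs(n) * cmath.exp(1j * n * theta)
               for n in range(-terms, terms + 1)).real

def poisson_closed(r, theta):
    """Closed form (1 - r^2) / (1 - 2 r cos(theta) + r^2)."""
    return (1 - r * r) / (1 - 2 * r * math.cos(theta) + r * r)

for r in (0.0, 0.5, 0.9):
    for theta in (0.0, 1.0, math.pi):
        print(r, theta, poisson_series(r, theta), poisson_closed(r, theta))
```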

Let us return to the problem formulated earlier. The definition of
the Fourier series of $f$ is purely formal, and it is not obvious whether it
converges to $f$. In fact, the solution of this problem can be very hard,
or relatively easy, depending on the sense in which we expect the series
to converge, or on what additional restrictions we place on $f$.

Let us be more precise. Suppose, for the sake of this discussion, that
the function $f$ (which is always assumed to be Riemann integrable) is
defined on $[-\pi, \pi]$. The first question one might ask is whether the partial
sums of the Fourier series of $f$ converge to $f$ pointwise. That is, do we
have

(1)   $$\lim_{N \to \infty} S_N(f)(\theta) = f(\theta) \quad\text{for every } \theta?$$

We see quite easily that in general we cannot expect this result to be
true at every $\theta$, since we can always change an integrable function at one
point without changing its Fourier coefficients. As a result, we might
ask the same question assuming that $f$ is continuous and periodic. For
a long time it was believed that under these additional assumptions the
answer would be “yes.” It was a surprise when Du Bois-Reymond showed
that there exists a continuous function whose Fourier series diverges at
a point. We will give such an example in the next chapter. Despite this
negative result, we might ask what happens if we add more smoothness
conditions on $f$: for example, we might assume that $f$ is continuously
differentiable, or twice continuously differentiable. We will see that then
the Fourier series of $f$ converges to $f$ uniformly.

We will also interpret the limit (1) by showing that the Fourier series
sums, in the sense of Cesàro or Abel, to the function $f$ at all of its points
of continuity. This approach involves appropriate averages of the partial
sums of the Fourier series of $f$.

Finally, we can also define the limit (1) in the mean square sense. In
the next chapter, we will show that if $f$ is merely integrable, then

$$\frac{1}{2\pi}\int_0^{2\pi} |S_N(f)(\theta) - f(\theta)|^2\,d\theta \to 0 \quad\text{as } N \to \infty.$$

It is of interest to know that the problem of pointwise convergence of
Fourier series was settled in 1966 by L. Carleson, who showed, among
other things, that if $f$ is integrable in our sense,⁴ then the Fourier series
of $f$ converges to $f$ except possibly on a set of “measure 0.” The proof
of this theorem is difficult and beyond the scope of this book.

2 Uniqueness of Fourier series 

If we were to assume that the Fourier series of functions $f$ converge to $f$
in an appropriate sense, then we could infer that a function is uniquely
determined by its Fourier coefficients. This would lead to the following
statement: if $f$ and $g$ have the same Fourier coefficients, then $f$ and $g$
are necessarily equal. By taking the difference $f - g$, this proposition
can be reformulated as: if $\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$, then $f = 0$. As stated,
this assertion cannot be correct without reservation, since calculating
Fourier coefficients requires integration, and we see that, for example,
any two functions which differ at finitely many points have the same
Fourier series. However, we do have the following positive result.

Theorem 2.1 Suppose that $f$ is an integrable function on the circle with
$\hat{f}(n) = 0$ for all $n \in \mathbb{Z}$. Then $f(\theta_0) = 0$ whenever $f$ is continuous at the
point $\theta_0$.

Thus, in terms of what we know about the set of discontinuities of
integrable functions,⁵ we can conclude that $f$ vanishes for “most” values
of $\theta$.

⁴ Carleson’s proof actually holds for the wider class of functions which are square
integrable in the Lebesgue sense.

⁵ See the appendix.

Proof. We suppose first that $f$ is real-valued, and argue by contradiction.
Assume, without loss of generality, that $f$ is defined on
$[-\pi, \pi]$, that $\theta_0 = 0$, and $f(0) > 0$. The idea now is to construct a family
of trigonometric polynomials $\{p_k\}$ that “peak” at 0, and so that
$\int p_k(\theta) f(\theta)\,d\theta \to \infty$ as $k \to \infty$. This will be our desired contradiction
since these integrals are equal to zero by assumption.

Since $f$ is continuous at 0, we can choose $0 < \delta \le \pi/2$, so that $f(\theta) >
f(0)/2$ whenever $|\theta| < \delta$. Let

$$p(\theta) = \epsilon + \cos\theta,$$

where $\epsilon > 0$ is chosen so small that $|p(\theta)| < 1 - \epsilon/2$, whenever $\delta \le |\theta| \le
\pi$. Then, choose a positive $\eta$ with $\eta < \delta$, so that $p(\theta) \ge 1 + \epsilon/2$, for
$|\theta| < \eta$. Finally, let

$$p_k(\theta) = [p(\theta)]^k,$$

and select $B$ so that $|f(\theta)| \le B$ for all $\theta$. This is possible since $f$ is
integrable, hence bounded. Figure 3 illustrates the family $\{p_k\}$.

Figure 3. The functions $p$, $p_6$, and $p_{15}$ when $\epsilon = 0.1$

By construction, each $p_k$ is a trigonometric polynomial, and since $\hat{f}(n) = 0$
for all $n$, we must have

$$\int_{-\pi}^{\pi} f(\theta)\, p_k(\theta)\,d\theta = 0 \quad\text{for all } k.$$

However, we have the estimate

$$\left| \int_{\delta \le |\theta|} f(\theta)\, p_k(\theta)\,d\theta \right| \le 2\pi B (1 - \epsilon/2)^k.$$













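Also, our choice of $\delta$ guarantees that $p(\theta)$ and $f(\theta)$ are non-negative
whenever $|\theta| < \delta$, thus

$$\int_{\eta \le |\theta| < \delta} f(\theta)\, p_k(\theta)\,d\theta \ge 0.$$

Finally, since $f(\theta) > f(0)/2$ and $p_k(\theta) \ge (1 + \epsilon/2)^k$ for $|\theta| < \eta$,

$$\int_{|\theta| < \eta} f(\theta)\, p_k(\theta)\,d\theta \ge 2\eta\, \frac{f(0)}{2}\, (1 + \epsilon/2)^k.$$

Therefore, $\int p_k(\theta) f(\theta)\,d\theta \to \infty$ as $k \to \infty$, and this concludes the proof
when $f$ is real-valued. In general, write $f(\theta) = u(\theta) + iv(\theta)$, where $u$ and
$v$ are real-valued. If we define $\bar{f}(\theta) = \overline{f(\theta)}$, then

$$u(\theta) = \frac{f(\theta) + \bar{f}(\theta)}{2} \quad\text{and}\quad v(\theta) = \frac{f(\theta) - \bar{f}(\theta)}{2i},$$

and since $\hat{\bar{f}}(n) = \overline{\hat{f}(-n)}$, we conclude that the Fourier coefficients of $u$
and $v$ all vanish, hence $f = 0$ at its points of continuity.

The idea of constructing a family of functions (trigonometric polynomials in this
case) which peak at the origin, together with other nice properties, will
play an important role in this book. Such families of functions will be
taken up later in Section 4 in connection with the notion of convolution.
For now, note that the above theorem implies the following.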

Corollary 2.2 If $f$ is continuous on the circle and $\hat{f}(n) = 0$ for all
$n \in \mathbb{Z}$, then $f = 0$.

The next corollary shows that the problem (1) formulated earlier has a
simple positive answer under the assumption that the series of Fourier
coefficients converges absolutely.

Corollary 2.3 Suppose that $f$ is a continuous function on the circle and
that the Fourier series of $f$ is absolutely convergent, $\sum_{n=-\infty}^{\infty} |\hat{f}(n)| < \infty$.
Then, the Fourier series converges uniformly to $f$, that is,

$$\lim_{N \to \infty} S_N(f)(\theta) = f(\theta) \quad\text{uniformly in } \theta.$$

Proof. Recall that if a sequence of continuous functions converges
uniformly, then the limit is also continuous. Now observe that the
assumption $\sum |\hat{f}(n)| < \infty$ implies that the partial sums of the Fourier
series of $f$ converge absolutely and uniformly, and therefore the function
$g$ defined by

$$g(\theta) = \sum_{n=-\infty}^{\infty} \hat{f}(n)\, e^{in\theta} = \lim_{N \to \infty} \sum_{n=-N}^{N} \hat{f}(n)\, e^{in\theta}$$

is continuous on the circle. Moreover, the Fourier coefficients of $g$ are
precisely $\hat{f}(n)$ since we can interchange the infinite sum with the integral
(a consequence of the uniform convergence of the series). Therefore, the
previous corollary applied to the function $f - g$ yields $f = g$, as desired.

What conditions on $f$ would guarantee the absolute convergence of its
Fourier series? As it turns out, the smoothness of $f$ is directly related
to the decay of the Fourier coefficients, and in general, the smoother the
function, the faster this decay. As a result, we can expect that relatively
smooth functions equal their Fourier series. This is in fact the case, as
we now show.

In order to state the result concisely we introduce the standard “O”
notation, which we will use freely in the rest of this book. For example,
the statement $\hat{f}(n) = O(1/|n|^2)$ as $|n| \to \infty$ means that the left-hand
side is bounded by a constant multiple of the right-hand side;
that is, there exists $C > 0$ with $|\hat{f}(n)| \le C/|n|^2$ for all large $|n|$. More
generally, $f(x) = O(g(x))$ as $x \to a$ means that for some constant $C$,
$|f(x)| \le C|g(x)|$ as $x$ approaches $a$. In particular, $f(x) = O(1)$ means
that $f$ is bounded.

Corollary 2.4 Suppose that $f$ is a twice continuously differentiable function
on the circle. Then

$$\hat{f}(n) = O(1/|n|^2) \quad\text{as } |n| \to \infty,$$


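so that the Fourier series of $f$ converges absolutely and uniformly to $f$.

Proof. The estimate on the Fourier coefficients is proved by integrating
by parts twice for $n \neq 0$. We obtain

$$\begin{aligned}
2\pi \hat{f}(n) &= \int_0^{2\pi} f(\theta)\, e^{-in\theta}\,d\theta \\
&= \left[ f(\theta) \cdot \frac{-e^{-in\theta}}{in} \right]_0^{2\pi} + \frac{1}{in}\int_0^{2\pi} f'(\theta)\, e^{-in\theta}\,d\theta \\
&= \frac{1}{in}\int_0^{2\pi} f'(\theta)\, e^{-in\theta}\,d\theta \\
&= \frac{1}{in}\left[ f'(\theta) \cdot \frac{-e^{-in\theta}}{in} \right]_0^{2\pi} + \frac{1}{(in)^2}\int_0^{2\pi} f''(\theta)\, e^{-in\theta}\,d\theta \\
&= \frac{-1}{n^2}\int_0^{2\pi} f''(\theta)\, e^{-in\theta}\,d\theta.
\end{aligned}$$

The quantities in brackets vanish since $f$ and $f'$ are periodic. Therefore

$$2\pi |n|^2 |\hat{f}(n)| \le \left| \int_0^{2\pi} f''(\theta)\, e^{-in\theta}\,d\theta \right| \le \int_0^{2\pi} |f''(\theta)|\,d\theta \le C,$$

where the constant $C$ is independent of $n$. (We can take $C = 2\pi B$ where
$B$ is a bound for $f''$.) Since $\sum 1/n^2$ converges, the proof of the corollary
is complete.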

Incidentally, we have also established the following important identity:

$$\hat{f'}(n) = in\hat{f}(n), \quad\text{for all } n \in \mathbb{Z}.$$

If $n \neq 0$ the proof is given above, and if $n = 0$ it is left as an exercise to the
reader. So if $f$ is differentiable and $f \sim \sum a_n e^{in\theta}$, then $f' \sim \sum a_n\, in\, e^{in\theta}$.
Also, if $f$ is twice continuously differentiable, then $f'' \sim \sum a_n (in)^2 e^{in\theta}$,
and so on. Further smoothness conditions on $f$ imply even better decay
of the Fourier coefficients (Exercise 10).

There are also stronger versions of Corollary 2.4. It can be shown, for
example, that the Fourier series of $f$ converges absolutely, assuming only
that $f$ has one continuous derivative. Even more generally, the Fourier
series of $f$ converges absolutely (and hence uniformly to $f$) if $f$ satisfies
a Hölder condition of order $\alpha$, with $\alpha > 1/2$, that is,

$$\sup_{\theta} |f(\theta + t) - f(\theta)| \le A|t|^{\alpha} \quad\text{for all } t.$$

For more on these matters, see the exercises at the end of Chapter 3.

At this point it is worthwhile to introduce a common notation: we say
that $f$ belongs to the class $C^k$ if $f$ is $k$ times continuously differentiable.
Belonging to the class $C^k$ or satisfying a Hölder condition are two possible
ways to describe the smoothness of a function.

3 Convolutions 

The notion of convolution of two functions plays a fundamental role in
Fourier analysis; it appears naturally in the context of Fourier series but
also serves more generally in the analysis of functions in other settings.

Given two $2\pi$-periodic integrable functions $f$ and $g$ on $\mathbb{R}$, we define
their convolution $f * g$ on $[-\pi, \pi]$ by

(2)   $$(f * g)(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(y)\, g(x - y)\,dy.$$

The above integral makes sense for each $x$, since the product of two
integrable functions is again integrable. Also, since the functions are
periodic, we can change variables to see that

$$(f * g)(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - y)\, g(y)\,dy.$$

Loosely speaking, convolutions correspond to “weighted averages.” For
instance, if $g = 1$ in (2), then $f * g$ is constant and equal to $\frac{1}{2\pi}\int_{-\pi}^{\pi} f(y)\,dy$,
which we may interpret as the average value of $f$ on the circle. Also, the
convolution $(f * g)(x)$ plays a role similar to, and in some sense replaces,
the pointwise product $f(x)g(x)$ of the two functions $f$ and $g$.

In the context of this chapter, our interest in convolutions originates
from the fact that the partial sums of the Fourier series of $f$ can be
expressed as follows:

$$\begin{aligned}
S_N(f)(x) &= \sum_{n=-N}^{N} \hat{f}(n)\, e^{inx} \\
&= \sum_{n=-N}^{N} \left( \frac{1}{2\pi}\int_{-\pi}^{\pi} f(y)\, e^{-iny}\,dy \right) e^{inx} \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(y) \left( \sum_{n=-N}^{N} e^{in(x-y)} \right) dy \\
&= (f * D_N)(x),
\end{aligned}$$

where $D_N$ is the $N$th Dirichlet kernel (see Example 4) given by

$$D_N(x) = \sum_{n=-N}^{N} e^{inx}.$$

So we observe that the problem of understanding $S_N(f)$ reduces to the
understanding of the convolution $f * D_N$.


We begin by gathering some of the main properties of convolutions.

Proposition 3.1 Suppose that $f$, $g$, and $h$ are $2\pi$-periodic integrable
functions. Then:

(i) $f * (g + h) = (f * g) + (f * h)$.

(ii) $(cf) * g = c(f * g) = f * (cg)$ for any $c \in \mathbb{C}$.

(iii) $f * g = g * f$.

(iv) $(f * g) * h = f * (g * h)$.

(v) $f * g$ is continuous.

(vi) $\widehat{f * g}(n) = \hat{f}(n)\,\hat{g}(n)$.

The first four points describe the algebraic properties of convolutions:
linearity, commutativity, and associativity. Property (v) exhibits an
important principle: the convolution $f * g$ is “more regular” than $f$ or $g$.
Here, $f * g$ is continuous while $f$ and $g$ are merely (Riemann) integrable.
Finally, (vi) is key in the study of Fourier series. In general, the Fourier
coefficients of the product $fg$ are not the product of the Fourier coefficients
of $f$ and $g$. However, (vi) says that this relation holds if we replace
the product of the two functions $f$ and $g$ by their convolution $f * g$.
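Property (vi) can be illustrated numerically for trigonometric polynomials, where a Riemann sum on a uniform grid reproduces the integrals up to rounding. The sketch below is only an illustration under that assumption: the two test functions, the grid size `M`, and the helper names are our own choices.

```python
import cmath
import math

M = 64  # grid size; uniform Riemann sums are exact for e^{ikx} with |k| < M
GRID = [-math.pi + 2 * math.pi * j / M for j in range(M)]

def coefficient(h, n):
    """Riemann-sum version of (1/2pi) * integral over [-pi, pi] of h(x) e^{-inx} dx."""
    return sum(h(x) * cmath.exp(-1j * n * x) for x in GRID) / M

def convolve(f, g):
    """Return x -> (1/2pi) * integral of f(y) g(x - y) dy, computed on the same grid."""
    return lambda x: sum(f(y) * g(x - y) for y in GRID) / M

def f(x):
    return 1 + 2 * math.cos(x) + 3 * math.sin(2 * x)

def g(x):
    return math.cos(x) - 0.5 * math.cos(2 * x)

fg = convolve(f, g)

# Compare the coefficients of f * g with the products of coefficients for |n| <= 3.
worst = max(
    abs(coefficient(fg, n) - coefficient(f, n) * coefficient(g, n))
    for n in range(-3, 4)
)
print(worst)
```

For general integrable functions the Riemann sums are only approximations, so the agreement would hold up to a discretization error rather than to rounding.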


Proof. Properties (i) and (ii) follow at once from the linearity of the
integral.

The other properties are easily deduced if we assume also that $f$ and
$g$ are continuous. In this case, we may freely interchange the order of
integration. For instance, to establish (vi) we write

$$\begin{aligned}
\widehat{f * g}(n) &= \frac{1}{2\pi}\int_{-\pi}^{\pi} (f * g)(x)\, e^{-inx}\,dx \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1}{2\pi}\left( \int_{-\pi}^{\pi} f(y)\, g(x - y)\,dy \right) e^{-inx}\,dx \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(y)\, e^{-iny} \left( \frac{1}{2\pi}\int_{-\pi}^{\pi} g(x - y)\, e^{-in(x-y)}\,dx \right) dy \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(y)\, e^{-iny} \left( \frac{1}{2\pi}\int_{-\pi}^{\pi} g(x)\, e^{-inx}\,dx \right) dy \\
&= \hat{f}(n)\,\hat{g}(n).
\end{aligned}$$

To prove (iii), one first notes that if $F$ is continuous and $2\pi$-periodic,
then

$$\int_{-\pi}^{\pi} F(y)\,dy = \int_{-\pi}^{\pi} F(x - y)\,dy \quad\text{for any } x \in \mathbb{R}.$$

The verification of this identity consists of a change of variables $y \mapsto -y$,
followed by a translation $y \mapsto y - x$. Then, one takes $F(y) = f(y)g(x - y)$.

Also, (iv) follows by interchanging two integral signs, and an appropriate
change of variables.

Finally, we show that if $f$ and $g$ are continuous, then $f * g$ is continuous.
First, we may write

$$(f * g)(x_1) - (f * g)(x_2) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(y)\left[ g(x_1 - y) - g(x_2 - y) \right] dy.$$

Since $g$ is continuous it must be uniformly continuous on any closed
and bounded interval. But $g$ is also periodic, so it must be uniformly
continuous on all of $\mathbb{R}$; given $\epsilon > 0$ there exists $\delta > 0$ so that $|g(s) -
g(t)| < \epsilon$ whenever $|s - t| < \delta$. Then, $|x_1 - x_2| < \delta$ implies $|(x_1 - y) -
(x_2 - y)| < \delta$ for any $y$, hence

$$\begin{aligned}
|(f * g)(x_1) - (f * g)(x_2)| &\le \frac{1}{2\pi}\left| \int_{-\pi}^{\pi} f(y)\left[ g(x_1 - y) - g(x_2 - y) \right] dy \right| \\
&\le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(y)|\, |g(x_1 - y) - g(x_2 - y)|\,dy \\
&\le \frac{\epsilon}{2\pi}\int_{-\pi}^{\pi} |f(y)|\,dy \\
&\le \frac{\epsilon}{2\pi}\, 2\pi B,
\end{aligned}$$

where $B$ is chosen so that $|f(x)| \le B$ for all $x$. As a result, we conclude
that $f * g$ is continuous, and the proposition is proved, at least when $f$
and $g$ are continuous.

In general, when $f$ and $g$ are merely integrable, we may use the results
established so far (when $f$ and $g$ are continuous), together with
the following approximation lemma, whose proof may be found in the
appendix.

Lemma 3.2 Suppose $f$ is integrable on the circle and bounded by $B$.
Then there exists a sequence $\{f_k\}_{k=1}^{\infty}$ of continuous functions on the
circle so that

$$\sup_{x \in [-\pi, \pi]} |f_k(x)| \le B \quad\text{for all } k = 1, 2, \ldots,$$

and

$$\int_{-\pi}^{\pi} |f(x) - f_k(x)|\,dx \to 0 \quad\text{as } k \to \infty.$$

Using this result, we may complete the proof of the proposition as
follows. Apply Lemma 3.2 to $f$ and $g$ to obtain sequences $\{f_k\}$ and $\{g_k\}$
of approximating continuous functions. Then

$$f * g - f_k * g_k = (f - f_k) * g + f_k * (g - g_k).$$

By the properties of the sequence $\{f_k\}$,

$$\begin{aligned}
|(f - f_k) * g(x)| &\le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x - y) - f_k(x - y)|\, |g(y)|\,dy \\
&\le \frac{1}{2\pi}\, \sup_y |g(y)| \int_{-\pi}^{\pi} |f(y) - f_k(y)|\,dy \\
&\to 0 \quad\text{as } k \to \infty.
\end{aligned}$$

Hence $(f - f_k) * g \to 0$ uniformly in $x$. Similarly, $f_k * (g - g_k) \to 0$
uniformly, and therefore $f_k * g_k$ tends uniformly to $f * g$. Since each $f_k * g_k$
is continuous, it follows that $f * g$ is also continuous, and we have (v).

Next, we establish (vi). For each fixed integer $n$ we must have
$\widehat{f_k * g_k}(n) \to \widehat{f * g}(n)$ as $k$ tends to infinity since $f_k * g_k$ converges
uniformly to $f * g$. However, we found earlier that $\hat{f}_k(n)\,\hat{g}_k(n) = \widehat{f_k * g_k}(n)$
because both $f_k$ and $g_k$ are continuous. Hence

$$|\hat{f}(n) - \hat{f}_k(n)| = \frac{1}{2\pi}\left| \int_{-\pi}^{\pi} (f(x) - f_k(x))\, e^{-inx}\,dx \right| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x) - f_k(x)|\,dx,$$

and as a result we find that $\hat{f}_k(n) \to \hat{f}(n)$ as $k$ goes to infinity. Similarly
$\hat{g}_k(n) \to \hat{g}(n)$, and the desired property is established once we let $k$ tend
to infinity. Finally, properties (iii) and (iv) follow from the same kind of
arguments.

4 Good kernels 

In the proof of Theorem 2.1 we constructed a sequence of trigonometric
polynomials $\{p_k\}$ with the property that the functions $p_k$ peaked at the
origin. As a result, we could isolate the behavior of $f$ at the origin. In
this section, we return to such families of functions, but this time in a
more general setting. First, we define the notion of good kernel, and
discuss the characteristic properties of such functions. Then, by the use
of convolutions, we show how these kernels can be used to recover a given
function.

A family of kernels $\{K_n(x)\}_{n=1}^{\infty}$ on the circle is said to be a family of
good kernels if it satisfies the following properties:

(a) For all $n \ge 1$,

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} K_n(x)\,dx = 1.$$

(b) There exists $M > 0$ such that for all $n \ge 1$,

$$\int_{-\pi}^{\pi} |K_n(x)|\,dx \le M.$$

(c) For every $\delta > 0$,

$$\int_{\delta \le |x| \le \pi} |K_n(x)|\,dx \to 0, \quad\text{as } n \to \infty.$$

In practice we shall encounter families where $K_n(x) \ge 0$, in which
case (b) is a consequence of (a). We may interpret the kernels $K_n(x)$
as weight distributions on the circle: property (a) says that $K_n$ assigns
unit mass to the whole circle $[-\pi, \pi]$, and (c) that this mass concentrates
near the origin as $n$ becomes large.⁶ Figure 4 (a) illustrates the typical
character of a family of good kernels.

The importance of good kernels is highlighted by their use in connection
with convolutions.

⁶ In the limit, a family of good kernels represents the “Dirac delta function.” This
terminology comes from physics.

Figure 4. Good kernels

Theorem 4.1 Let $\{K_n\}_{n=1}^{\infty}$ be a family of good kernels, and $f$ an integrable
function on the circle. Then

$$\lim_{n \to \infty} (f * K_n)(x) = f(x)$$

whenever $f$ is continuous at $x$. If $f$ is continuous everywhere, then the
above limit is uniform.

Because of this result, the family $\{K_n\}$ is sometimes referred to as an
approximation to the identity.

We have previously interpreted convolutions as weighted averages. In
this context, the convolution

$$(f * K_n)(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x - y)\, K_n(y)\,dy$$

is the average of $f(x - y)$, where the weights are given by $K_n(y)$. However,
the weight distribution $K_n$ concentrates its mass at $y = 0$ as $n$
becomes large. Hence in the integral, the value $f(x)$ is assigned the full
mass as $n \to \infty$. Figure 4 (b) illustrates this point.

Proof of Theorem 4.1. If $\epsilon > 0$ and $f$ is continuous at $x$, choose $\delta$ so
that $|y| < \delta$ implies $|f(x - y) - f(x)| < \epsilon$. Then, by the first property of
good kernels, we can write

$$\begin{aligned}
(f * K_n)(x) - f(x) &= \frac{1}{2\pi}\int_{-\pi}^{\pi} K_n(y)\, f(x - y)\,dy - f(x) \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} K_n(y)\left[ f(x - y) - f(x) \right] dy.
\end{aligned}$$

Hence,

$$\begin{aligned}
|(f * K_n)(x) - f(x)| &= \frac{1}{2\pi}\left| \int_{-\pi}^{\pi} K_n(y)\left[ f(x - y) - f(x) \right] dy \right| \\
&\le \frac{1}{2\pi}\int_{|y| < \delta} |K_n(y)|\, |f(x - y) - f(x)|\,dy
 + \frac{1}{2\pi}\int_{\delta \le |y| \le \pi} |K_n(y)|\, |f(x - y) - f(x)|\,dy \\
&\le \frac{\epsilon}{2\pi}\int_{-\pi}^{\pi} |K_n(y)|\,dy + \frac{2B}{2\pi}\int_{\delta \le |y| \le \pi} |K_n(y)|\,dy,
\end{aligned}$$

where $B$ is a bound for $f$. The first term is bounded by $\epsilon M/2\pi$ because
of the second property of good kernels. By the third property we see
that for all large $n$, the second term will be less than $\epsilon$. Therefore, for
some constant $C > 0$ and all large $n$ we have

$$|(f * K_n)(x) - f(x)| \le C\epsilon,$$

thereby proving the first assertion in the theorem. If $f$ is continuous
everywhere, then it is uniformly continuous, and $\delta$ can be chosen
independent of $x$. This provides the desired conclusion that $f * K_n \to f$
uniformly.

Recall from the beginning of Section 3 that

$$S_N(f)(x) = (f * D_N)(x),$$

where $D_N(x) = \sum_{n=-N}^{N} e^{inx}$ is the Dirichlet kernel. It is natural now for
us to ask whether $D_N$ is a good kernel, since if this were true, Theorem 4.1
would imply that the Fourier series of $f$ converges to $f(x)$ whenever $f$ is
continuous at $x$. Unfortunately, this is not the case. Indeed, an estimate
shows that $D_N$ violates the second property; more precisely, one has (see
Problem 2)

$$\int_{-\pi}^{\pi} |D_N(x)|\,dx \ge c \log N, \quad\text{as } N \to \infty.$$

However, we should note that the formula for $D_N$ as a sum of exponentials
immediately gives

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} D_N(x)\,dx = 1,$$

so the first property of good kernels is actually verified. The fact that the
mean value of $D_N$ is 1, while the integral of its absolute value is large,
is a result of cancellations. Indeed, Figure 5 shows that the function
$D_N(x)$ takes on positive and negative values and oscillates very rapidly
as $N$ gets large.

Figure 5. The Dirichlet kernel for large $N$

This observation suggests that the pointwise convergence of Fourier
series is intricate, and may even fail at points of continuity. This is
indeed the case, as we will see in the next chapter.
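The growth of the integral of $|D_N|$ can be observed numerically. The sketch below uses the closed form of Example 4 and a midpoint rule (both the helper names and the sample count are our choices); it only illustrates the trend for a few values of $N$ and is not a substitute for the estimate in Problem 2.

```python
import math

def dirichlet(N, x):
    """Closed form of D_N(x); requires sin(x/2) != 0."""
    return math.sin((N + 0.5) * x) / math.sin(x / 2)

def mean_abs_dirichlet(N, samples=20000):
    """Midpoint-rule approximation of (1/2pi) * integral over [-pi, pi] of |D_N(x)| dx."""
    h = 2 * math.pi / samples
    total = 0.0
    for j in range(samples):
        x = -math.pi + (j + 0.5) * h  # with an even sample count, x is never 0
        total += abs(dirichlet(N, x))
    return total * h / (2 * math.pi)

values = {N: mean_abs_dirichlet(N) for N in (4, 16, 64)}
for N, value in values.items():
    print(N, value, value / math.log(N))
```

The normalized integrals increase slowly with $N$, in contrast with property (b) of good kernels, which would require them to stay bounded.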

5 Cesàro and Abel summability: applications to Fourier
series

Since a Fourier series may fail to converge at individual points, we are
led to try to overcome this failure by interpreting the limit

$$\lim_{N \to \infty} S_N(f) = f$$

in a different sense.

5.1 Cesàro means and summation

We begin by taking ordinary averages of the partial sums, a technique
which we now describe in more detail.

Suppose we are given a series of complex numbers

$$c_0 + c_1 + c_2 + \cdots = \sum_{k=0}^{\infty} c_k.$$

We define the $n$th partial sum $s_n$ by

$$s_n = \sum_{k=0}^{n} c_k,$$

and say that the series converges to $s$ if $\lim_{n \to \infty} s_n = s$. This is the
most natural and most commonly used type of “summability.” Consider,
however, the example of the series

(3)   $$1 - 1 + 1 - 1 + \cdots = \sum_{k=0}^{\infty} (-1)^k.$$

Its partial sums form the sequence $\{1, 0, 1, 0, \ldots\}$ which has no limit.
Because these partial sums alternate evenly between 1 and 0, one might
therefore suggest that 1/2 is the “limit” of the sequence, and hence 1/2
equals the “sum” of that particular series. We give a precise meaning to
this by defining the average of the first $N$ partial sums by

$$\sigma_N = \frac{s_0 + s_1 + \cdots + s_{N-1}}{N}.$$

The quantity $\sigma_N$ is called the $N$th Cesàro mean⁷ of the sequence $\{s_k\}$
or the $N$th Cesàro sum of the series $\sum_{k=0}^{\infty} c_k$.

If $\sigma_N$ converges to a limit $\sigma$ as $N$ tends to infinity, we say that the
series $\sum c_n$ is Cesàro summable to $\sigma$. In the case of series of functions,
we shall understand the limit in the sense of either pointwise or uniform
convergence, depending on the situation.

The reader will have no difficulty checking that in the above example (3),
the series is Cesàro summable to 1/2. Moreover, one can show
that Cesàro summation is a more inclusive process than convergence. In
fact, if a series is convergent to $s$, then it is also Cesàro summable to the
same limit $s$ (Exercise 12).
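The Cesàro means of the series (3) are easy to compute directly. The following sketch (the function name `cesaro_means` is ours) accumulates the partial sums $s_0, s_1, \ldots$ and their running averages for the first 1000 terms.

```python
def cesaro_means(terms):
    """Return [sigma_1, sigma_2, ...], where sigma_N = (s_0 + ... + s_{N-1}) / N."""
    partial = 0   # current partial sum s_{N-1}
    running = 0   # s_0 + ... + s_{N-1}
    means = []
    for N, c in enumerate(terms, start=1):
        partial += c
        running += partial
        means.append(running / N)
    return means

# Series (3): c_k = (-1)^k, whose partial sums are 1, 0, 1, 0, ...
means = cesaro_means([(-1) ** k for k in range(1000)])
print(means[:4], means[-1])
```

The first few means are 1, 1/2, 2/3, 1/2, and the later ones stay close to 1/2, even though the partial sums themselves keep alternating between 1 and 0.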


⁷ Note that if the series $\sum_{k=1}^{\infty} c_k$ begins with the term $k = 1$, then it is common
practice to define $\sigma_N = (s_1 + \cdots + s_N)/N$. This change of notation has little effect on what
follows.

5.2 Fejér’s theorem

An interesting application of Cesàro summability appears in the context
of Fourier series.

We mentioned earlier that the Dirichlet kernels fail to belong to the
family of good kernels. Quite surprisingly, their averages are very well
behaved functions, in the sense that they do form a family of good kernels.

To see this, we form the $N$th Cesàro mean of the Fourier series, which
by definition is

$$\sigma_N(f)(x) = \frac{S_0(f)(x) + \cdots + S_{N-1}(f)(x)}{N}.$$

Since $S_n(f) = f * D_n$, we find that

$$\sigma_N(f)(x) = (f * F_N)(x),$$

where $F_N(x)$ is the $N$th Fejér kernel given by

$$F_N(x) = \frac{D_0(x) + \cdots + D_{N-1}(x)}{N}.$$

Lemma 5.1 We have

$$F_N(x) = \frac{1}{N}\, \frac{\sin^2(Nx/2)}{\sin^2(x/2)},$$

and the Fejér kernel is a good kernel.

The proof of the formula for $F_N$ (a simple application of trigonometric
identities) is outlined in Exercise 15. To prove the rest of the lemma, note
that $F_N$ is positive and $\frac{1}{2\pi}\int_{-\pi}^{\pi} F_N(x)\,dx = 1$, in view of the fact that a
similar identity holds for the Dirichlet kernels $D_n$. However, $\sin^2(x/2) \ge
c_\delta > 0$, if $\delta \le |x| \le \pi$, hence $F_N(x) \le 1/(N c_\delta)$, from which it follows that

$$\int_{\delta \le |x| \le \pi} |F_N(x)|\,dx \to 0 \quad\text{as } N \to \infty.$$
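The formula of Lemma 5.1 can be compared with the defining average of Dirichlet kernels at a few sample points. This is a sketch with our own helper names; the closed forms are used away from multiples of $2\pi$, where $\sin(x/2)$ vanishes.

```python
import math

def dirichlet(n, x):
    """Closed form of the Dirichlet kernel D_n(x), valid when sin(x/2) != 0."""
    return math.sin((n + 0.5) * x) / math.sin(x / 2)

def fejer_average(N, x):
    """F_N(x) as the average (D_0(x) + ... + D_{N-1}(x)) / N."""
    return sum(dirichlet(n, x) for n in range(N)) / N

def fejer_closed(N, x):
    """F_N(x) from Lemma 5.1: sin^2(N x / 2) / (N sin^2(x / 2))."""
    return math.sin(N * x / 2) ** 2 / (N * math.sin(x / 2) ** 2)

for N in (1, 4, 25):
    for x in (0.2, 1.0, 3.0):
        print(N, x, fejer_average(N, x), fejer_closed(N, x))
```

Both columns agree up to rounding, and the closed form makes the positivity of $F_N$ evident, in contrast with the oscillating sign of $D_N$.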

Applying Theorem 4.1 to this new family of good kernels yields the
following important result.

Theorem 5.2 If $f$ is integrable on the circle, then the Fourier series of
$f$ is Cesàro summable to $f$ at every point of continuity of $f$.

Moreover, if $f$ is continuous on the circle, then the Fourier series of
$f$ is uniformly Cesàro summable to $f$.
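As an illustration of Theorem 5.2, one can compute the Cesàro means of the Fourier series of the function $f(\theta) = \theta$ from Example 1 at a point of continuity. The sketch below evaluates the sine series at $\theta = 1$; the helper names and the choice of $N$ are ours.

```python
import math

def partial_sum(N, theta):
    """S_N(f)(theta) for f(theta) = theta, using the sine series of Example 1."""
    return 2 * sum((-1) ** (n + 1) * math.sin(n * theta) / n for n in range(1, N + 1))

def cesaro_mean(N, theta):
    """sigma_N(f)(theta) = (S_0 + ... + S_{N-1}) / N, accumulated in a single pass."""
    s = 0.0       # current partial sum, starting from S_0 = 0
    total = 0.0   # S_0 + ... + S_{n-1}
    for n in range(1, N + 1):
        total += s
        s += 2 * (-1) ** (n + 1) * math.sin(n * theta) / n
    return total / N

for N in (10, 100, 2000):
    print(N, partial_sum(N, 1.0), cesaro_mean(N, 1.0))
```

At $\theta = 1$ the Cesàro means approach $f(1) = 1$ as $N$ grows, as the theorem predicts for a point of continuity.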

We may now state two corollaries. The first is a result that we have
already established. The second is new, and of fundamental importance.

Corollary 5.3 If $f$ is integrable on the circle and $\hat{f}(n) = 0$ for all $n$,
then $f = 0$ at all points of continuity of $f$.

The proof is immediate since all the partial sums are 0, hence all the
Cesàro means are 0.

Corollary 5.4 Continuous functions on the circle can be uniformly
approximated by trigonometric polynomials.

This means that if $f$ is continuous on $[-\pi, \pi]$ with $f(-\pi) = f(\pi)$ and
$\epsilon > 0$, then there exists a trigonometric polynomial $P$ such that

$$|f(x) - P(x)| < \epsilon \quad\text{for all } -\pi \le x \le \pi.$$

This follows immediately from the theorem since the partial sums, hence
the Cesàro means, are trigonometric polynomials. Corollary 5.4 is the
periodic analogue of the Weierstrass approximation theorem for polynomials
which can be found in Exercise 16.


5.3 Abel means and summation 


Another method of summation was first considered by Abel and actually
predates the Cesàro method.

A series of complex numbers $\sum_{k=0}^{\infty} c_k$ is said to be Abel summable
to $s$ if for every $0 \le r < 1$, the series

$$A(r) = \sum_{k=0}^{\infty} c_k r^k$$

converges, and

$$\lim_{r \to 1} A(r) = s.$$

The quantities $A(r)$ are called the Abel means of the series. One can
prove that if the series converges to $s$, then it is Abel summable to $s$.
Moreover, the method of Abel summability is even more powerful than
the Cesàro method: when the series is Cesàro summable, it is always
Abel summable to the same sum. However, if we consider the series

$$1 - 2 + 3 - 4 + 5 - \cdots = \sum_{k=0}^{\infty} (-1)^k (k + 1),$$

then one can show that it is Abel summable to 1/4 since

$$A(r) = \sum_{k=0}^{\infty} (-1)^k (k + 1) r^k = \frac{1}{(1 + r)^2},$$

but this series is not Cesàro summable; see Exercise 13.
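Both claims about the series $1 - 2 + 3 - 4 + \cdots$ can be explored numerically. In the sketch below (helper names and the truncation at 5000 terms are our choices), the truncated Abel mean is compared with $1/(1+r)^2$, and the Cesàro means are computed for consecutive values of $N$ to show that they do not settle.

```python
def abel_mean(r, terms=5000):
    """Truncated Abel mean: sum of (-1)^k (k + 1) r^k for k < terms, with 0 <= r < 1."""
    return sum((-1) ** k * (k + 1) * r ** k for k in range(terms))

def cesaro_mean(N):
    """sigma_N = (s_0 + ... + s_{N-1}) / N for the series with c_k = (-1)^k (k + 1)."""
    s, total = 0, 0
    for k in range(N):
        s += (-1) ** k * (k + 1)   # s is now the partial sum s_k
        total += s
    return total / N

for r in (0.5, 0.9, 0.99):
    print(r, abel_mean(r), 1 / (1 + r) ** 2)

print(cesaro_mean(1000), cesaro_mean(1001))
```

The Abel means approach 1/4 as $r$ increases toward 1, while the Cesàro means for even and odd $N$ stay near two different values, so they have no limit.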
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5.4 The Poisson kernel and Dirichlet’s problem in the unit disc 

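To adapt Abel summability to the context of Fourier series, we define
the Abel means of the function $f(\theta) \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}$ by

$$A_r(f)(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} a_n e^{in\theta}.$$

Since the index $n$ takes positive and negative values, it is natural to write
$c_0 = a_0$, and $c_n = a_n e^{in\theta} + a_{-n} e^{-in\theta}$ for $n > 0$, so that the Abel means
of the Fourier series correspond to the definition given in the previous
section for numerical series.

We note that since $f$ is integrable, $|a_n|$ is uniformly bounded in $n$, so
that $A_r(f)$ converges absolutely and uniformly for each $0 \le r < 1$. Just
as in the case of Cesàro means, the key fact is that these Abel means can
be written as convolutions

$$A_r(f)(\theta) = (f * P_r)(\theta),$$

where $P_r(\theta)$ is the Poisson kernel given by

(4)   $$P_r(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}.$$

In fact,

$$\begin{aligned}
A_r(f)(\theta) &= \sum_{n=-\infty}^{\infty} r^{|n|} a_n e^{in\theta} \\
&= \sum_{n=-\infty}^{\infty} r^{|n|} \left( \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi)\, e^{-in\varphi}\,d\varphi \right) e^{in\theta} \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi) \left( \sum_{n=-\infty}^{\infty} r^{|n|} e^{-in(\varphi - \theta)} \right) d\varphi,
\end{aligned}$$

where the interchange of the integral and infinite sum is justified by the
uniform convergence of the series.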

Lemma 5.5 If $0 \le r < 1$, then

$$P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2}.$$

The Poisson kernel is a good kernel,⁸ as $r$ tends to 1 from below.

Proof. The identity $P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2}$ has already been derived in
Section 1.1. Note that

$$1 - 2r\cos\theta + r^2 = (1 - r)^2 + 2r(1 - \cos\theta).$$

Hence if $1/2 \le r \le 1$ and $\delta \le |\theta| \le \pi$, then

$$1 - 2r\cos\theta + r^2 \ge c_\delta > 0.$$

Thus $P_r(\theta) \le (1 - r^2)/c_\delta$ when $\delta \le |\theta| \le \pi$, and the third property of
good kernels is verified. Clearly $P_r(\theta) \ge 0$, and integrating the expression (4)
term by term (which is justified by the absolute convergence of
the series) yields

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\theta)\,d\theta = 1,$$

thereby concluding the proof that $P_r$ is a good kernel.

Combining this lemma with Theorem 4.1, we obtain our next result.

Theorem 5.6 The Fourier series of an integrable function $f$ on the circle
is Abel summable to $f$ at every point of continuity. Moreover, if $f$ is
continuous on the circle, then the Fourier series of $f$ is uniformly Abel
summable to $f$.

We now return to a problem discussed in Chapter 1, where we sketched
the solution of the steady-state heat equation $\Delta u = 0$ in the unit disc
with boundary condition $u = f$ on the circle. We expressed the Laplacian
in terms of polar coordinates, separated variables, and expected that a
solution was given by

(5)   $$u(r, \theta) = \sum_{m=-\infty}^{\infty} a_m r^{|m|} e^{im\theta},$$

where $a_m$ was the $m$th Fourier coefficient of $f$. In other words, we were
led to take

$$u(r, \theta) = A_r(f)(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi)\, P_r(\theta - \varphi)\,d\varphi.$$

We are now in a position to show that this is indeed the case. 


⁸ In this case, the family of kernels is indexed by a continuous parameter $0 \le r < 1$,
rather than the discrete $n$ considered previously. In the definition of good kernels, we
simply replace $n$ by $r$ and take the limit in property (c) appropriately, for example $r \to 1$
in this case.

Theorem 5.7 Let f be an integrable function defined on the unit circle.
Then the function u defined in the unit disc by the Poisson integral

\[ u(r, \theta) = (f * P_r)(\theta) \tag{6} \]

has the following properties:

(i) $u$ has two continuous derivatives in the unit disc and satisfies
$\Delta u = 0$.

(ii) If $\theta$ is any point of continuity of $f$, then

\[ \lim_{r \to 1} u(r, \theta) = f(\theta). \]

If $f$ is continuous everywhere, then this limit is uniform.

(iii) If $f$ is continuous, then $u(r, \theta)$ is the unique solution to the
steady-state heat equation in the disc which satisfies conditions (i) and (ii).

Proof. For (i), we recall that the function $u$ is given by the series (5).
Fix $\rho < 1$; inside each disc of radius $r < \rho < 1$ centered at the origin, the
series for $u$ can be differentiated term by term, and the differentiated
series is uniformly and absolutely convergent. Thus $u$ can be differentiated
twice (in fact infinitely many times), and since this holds for all $\rho < 1$,
we conclude that $u$ is twice differentiable inside the unit disc. Moreover,
in polar coordinates,

\[ \Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2}, \]

so term by term differentiation shows that $\Delta u = 0$.

The proof of (ii) is a simple application of the previous theorem. To
prove (iii) we argue as follows. Suppose $v$ solves the steady-state heat
equation in the disc and converges to $f$ uniformly as $r$ tends to 1 from
below. For each fixed $r$ with $0 < r < 1$, the function $v(r, \theta)$ has a
Fourier series

\[ \sum_{n=-\infty}^{\infty} a_n(r) e^{in\theta} \quad \text{where} \quad a_n(r) = \frac{1}{2\pi} \int_{-\pi}^{\pi} v(r, \theta) e^{-in\theta}\, d\theta. \]

Taking into account that $v(r, \theta)$ solves the equation

\[ \frac{\partial^2 v}{\partial r^2} + \frac{1}{r} \frac{\partial v}{\partial r} + \frac{1}{r^2} \frac{\partial^2 v}{\partial \theta^2} = 0, \tag{7} \]

we find that

\[ a_n''(r) + \frac{1}{r} a_n'(r) - \frac{n^2}{r^2} a_n(r) = 0. \tag{8} \]

Indeed, we may first multiply (7) by $e^{-in\theta}$ and integrate in $\theta$. Then,
since $v$ is periodic, two integrations by parts give

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{\partial^2 v}{\partial \theta^2}(r, \theta) e^{-in\theta}\, d\theta = -n^2 a_n(r). \]

Finally, we may interchange the order of differentiation and integration,
which is permissible since $v$ has two continuous derivatives; this
yields (8).

Therefore, we must have $a_n(r) = A_n r^n + B_n r^{-n}$ for some constants
$A_n$ and $B_n$, when $n \ne 0$ (see Exercise 11 in Chapter 1). To evaluate the
constants, we first observe that each term $a_n(r)$ is bounded because $v$ is
bounded, therefore $B_n = 0$. To find $A_n$ we let $r \to 1$. Since $v$ converges
uniformly to $f$ as $r \to 1$ we find that

\[ A_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(\theta) e^{-in\theta}\, d\theta. \]

By a similar argument, this formula also holds when $n = 0$. Our
conclusion is that for each $0 < r < 1$, the Fourier series of $v$ is given by the
series of $u(r, \theta)$, so by the uniqueness of Fourier series for continuous
functions, we must have $u = v$.

Remark. By part (iii) of the theorem, we may conclude that if $u$
solves $\Delta u = 0$ in the disc, and converges to 0 uniformly as $r \to 1$, then
$u$ must be identically 0. However, if uniform convergence is replaced by
pointwise convergence, this conclusion may fail; see Exercise 18.


6 Exercises 


1. Suppose $f$ is $2\pi$-periodic and integrable on any finite interval. Prove that if
$a, b \in \mathbb{R}$, then

\[ \int_a^b f(x)\, dx = \int_{a+2\pi}^{b+2\pi} f(x)\, dx = \int_{a-2\pi}^{b-2\pi} f(x)\, dx. \]

Also prove that

\[ \int_{-\pi}^{\pi} f(x + a)\, dx = \int_{-\pi}^{\pi} f(x)\, dx = \int_{-\pi+a}^{\pi+a} f(x)\, dx. \]

2. In this exercise we show how the symmetries of a function imply certain
properties of its Fourier coefficients. Let $f$ be a $2\pi$-periodic Riemann integrable
function defined on $\mathbb{R}$.

(a) Show that the Fourier series of the function $f$ can be written as

\[ f(\theta) \sim \hat{f}(0) + \sum_{n \ge 1} [\hat{f}(n) + \hat{f}(-n)] \cos n\theta + i[\hat{f}(n) - \hat{f}(-n)] \sin n\theta. \]

(b) Prove that if $f$ is even, then $\hat{f}(n) = \hat{f}(-n)$, and we get a cosine series.

(c) Prove that if $f$ is odd, then $\hat{f}(n) = -\hat{f}(-n)$, and we get a sine series.

(d) Suppose that $f(\theta + \pi) = f(\theta)$ for all $\theta \in \mathbb{R}$. Show that $\hat{f}(n) = 0$ for all
odd $n$.

(e) Show that $f$ is real-valued if and only if $\overline{\hat{f}(n)} = \hat{f}(-n)$ for all $n$.


3. We return to the problem of the plucked string discussed in Chapter 1. Show
that the initial condition $f$ is equal to its Fourier sine series

\[ f(x) = \sum_{m=1}^{\infty} A_m \sin mx \quad \text{with} \quad A_m = \frac{2h}{m^2} \frac{\sin mp}{p(\pi - p)}. \]

[Hint: Note that $|A_m| \le C/m^2$.]

4. Consider the $2\pi$-periodic odd function defined on $[0, \pi]$ by $f(\theta) = \theta(\pi - \theta)$.

(a) Draw the graph of $f$.

(b) Compute the Fourier coefficients of $f$, and show that

\[ f(\theta) = \frac{8}{\pi} \sum_{k \text{ odd} \ge 1} \frac{\sin k\theta}{k^3}. \]


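5. On the interval $[-\pi, \pi]$ consider the function

\[ f(\theta) = \begin{cases} 0 & \text{if } |\theta| > \delta, \\ 1 - |\theta|/\delta & \text{if } |\theta| \le \delta. \end{cases} \]

Thus the graph of $f$ has the shape of a triangular tent. Show that

\[ f(\theta) = \frac{\delta}{2\pi} + 2 \sum_{n=1}^{\infty} \frac{1 - \cos n\delta}{n^2 \pi \delta} \cos n\theta. \]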


6. Let $f$ be the function defined on $[-\pi, \pi]$ by $f(\theta) = |\theta|$.

(a) Draw the graph of $f$.

(b) Calculate the Fourier coefficients of $f$, and show that

\[ \hat{f}(n) = \begin{cases} \dfrac{\pi}{2} & \text{if } n = 0, \\[1ex] \dfrac{-1 + (-1)^n}{\pi n^2} & \text{if } n \ne 0. \end{cases} \]

(c) What is the Fourier series of $f$ in terms of sines and cosines?

(d) Taking $\theta = 0$, prove that

\[ \sum_{n \text{ odd} \ge 1} \frac{1}{n^2} = \frac{\pi^2}{8} \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \]

See also Example 2 in Section 1.1.

7. Suppose $\{a_n\}_{n=1}^{N}$ and $\{b_n\}_{n=1}^{N}$ are two finite sequences of complex numbers.
Let $B_k = \sum_{n=1}^{k} b_n$ denote the partial sums of the series $\sum b_n$ with the convention
$B_0 = 0$.

(a) Prove the summation by parts formula

\[ \sum_{n=M}^{N} a_n b_n = a_N B_N - a_M B_{M-1} - \sum_{n=M}^{N-1} (a_{n+1} - a_n) B_n. \]

(b) Deduce from this formula Dirichlet’s test for convergence of a series: if the
partial sums of the series $\sum b_n$ are bounded, and $\{a_n\}$ is a sequence of
real numbers that decreases monotonically to 0, then $\sum a_n b_n$ converges.
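Before proving the formula in part (a), it can help to test it on concrete numbers. The following sketch is an illustration only; the function name and the 1-based indexing helpers are assumptions made for this example. It evaluates both sides of the summation by parts formula for given finite sequences.

```python
def summation_by_parts(a, b, M, N):
    """Return both sides of the formula in part (a) for 1 <= M <= N <= len(a).

    a[0] and b[0] hold a_1 and b_1, so a_n is a[n - 1].
    """
    # B[k] = b_1 + ... + b_k, with the convention B[0] = 0.
    B = [0]
    for term in b:
        B.append(B[-1] + term)
    a_ = lambda n: a[n - 1]
    b_ = lambda n: b[n - 1]
    left = sum(a_(n) * b_(n) for n in range(M, N + 1))
    right = (a_(N) * B[N] - a_(M) * B[M - 1]
             - sum((a_(n + 1) - a_(n)) * B[n] for n in range(M, N)))
    return left, right
```

With integer inputs the two returned values agree exactly; with complex floating-point inputs they agree up to rounding.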


8. Verify that $\frac{1}{2i} \sum_{n \ne 0} \frac{e^{inx}}{n}$ is the Fourier series of the $2\pi$-periodic sawtooth
function illustrated in Figure 6, defined by $f(0) = 0$, and

\[ f(x) = \begin{cases} -\dfrac{\pi}{2} - \dfrac{x}{2} & \text{if } -\pi < x < 0, \\[1ex] \dfrac{\pi}{2} - \dfrac{x}{2} & \text{if } 0 < x < \pi. \end{cases} \]

Note that this function is not continuous. Show that nevertheless, the series
converges for every $x$ (by which we mean, as usual, that the symmetric partial
sums of the series converge). In particular, the value of the series at the origin,
namely 0, is the average of the values of $f(x)$ as $x$ approaches the origin from
the left and the right.

Figure 6. The sawtooth function

[Hint: Use Dirichlet’s test for convergence of a series $\sum a_n b_n$.]

9. Let $f(x) = \chi_{[a,b]}(x)$ be the characteristic function of the interval $[a, b] \subset
[-\pi, \pi]$, that is,

\[ \chi_{[a,b]}(x) = \begin{cases} 1 & \text{if } x \in [a, b], \\ 0 & \text{otherwise.} \end{cases} \]

(a) Show that the Fourier series of $f$ is given by

\[ f(x) \sim \frac{b - a}{2\pi} + \sum_{n \ne 0} \frac{e^{-ina} - e^{-inb}}{2\pi i n} e^{inx}. \]

The sum extends over all positive and negative integers excluding 0.

(b) Show that if $a \ne -\pi$ or $b \ne \pi$ and $a \ne b$, then the Fourier series does not
converge absolutely for any $x$. [Hint: It suffices to prove that for many
values of $n$ one has $|\sin n\theta_0| \ge c > 0$ where $\theta_0 = (b - a)/2$.]

(c) However, prove that the Fourier series converges at every point $x$. What
happens if $a = -\pi$ and $b = \pi$?


10. Suppose $f$ is a periodic function of period $2\pi$ which belongs to the class $C^k$.
Show that

\[ \hat{f}(n) = O(1/|n|^k) \quad \text{as } |n| \to \infty. \]

This notation means that there exists a constant $C$ such that $|\hat{f}(n)| \le C/|n|^k$. We
could also write this as $|n|^k \hat{f}(n) = O(1)$, where $O(1)$ means bounded.

[Hint: Integrate by parts.]


11. Suppose that $\{f_k\}_{k=1}^{\infty}$ is a sequence of Riemann integrable functions on the
interval $[0, 1]$ such that

\[ \int_0^1 |f_k(x) - f(x)|\, dx \to 0 \quad \text{as } k \to \infty. \]

Show that $\hat{f}_k(n) \to \hat{f}(n)$ uniformly in $n$ as $k \to \infty$.

12. Prove that if a series of complex numbers $\sum c_n$ converges to $s$, then $\sum c_n$
is Cesàro summable to $s$.

[Hint: Assume $s_n \to 0$ as $n \to \infty$.]

13. The purpose of this exercise is to prove that Abel summability is stronger
than the standard or Cesàro methods of summation.

(a) Show that if the series $\sum_{n=1}^{\infty} c_n$ of complex numbers converges to a finite
limit $s$, then the series is Abel summable to $s$. [Hint: Why is it enough to
prove the theorem when $s = 0$? Assuming $s = 0$, show that if $s_N = c_1 +
\cdots + c_N$, then $\sum_{n=1}^{N} c_n r^n = (1 - r) \sum_{n=1}^{N} s_n r^n + s_N r^{N+1}$. Let $N \to \infty$
to show that

\[ \sum c_n r^n = (1 - r) \sum s_n r^n. \]

Finally, prove that the right-hand side converges to 0 as $r \to 1$.]

(b) However, show that there exist series which are Abel summable, but that
do not converge. [Hint: Try $c_n = (-1)^n$. What is the Abel limit of $\sum c_n$?]

(c) Argue similarly to prove that if a series $\sum_{n=1}^{\infty} c_n$ is Cesàro summable to
$\sigma$, then it is Abel summable to $\sigma$. [Hint: Note that

\[ \sum_{n=1}^{\infty} c_n r^n = (1 - r)^2 \sum_{n=1}^{\infty} n \sigma_n r^n, \]

and assume $\sigma = 0$.]

(d) Give an example of a series that is Abel summable but not Cesàro summable.
[Hint: Try $c_n = (-1)^{n-1} n$. Note that if $\sum c_n$ is Cesàro summable, then
$c_n/n$ tends to 0.]

The results above can be summarized by the following implications about
series:

convergent $\Rightarrow$ Cesàro summable $\Rightarrow$ Abel summable,

and the fact that none of the arrows can be reversed.
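The two hinted series in parts (b) and (d) can be explored numerically. The sketch below is only an illustration under stated assumptions: the series are indexed from $n = 0$, the Abel mean is truncated after a fixed number of terms, and the helper names are invented for this example. The comparison values use the geometric series identities $\sum_{n \ge 0} (-r)^n = 1/(1+r)$ and $\sum_{n \ge 1} (-1)^{n-1} n r^n = r/(1+r)^2$.

```python
def partial_sums(c, count):
    """First `count` partial sums s_0, s_1, ... of the series sum c(n), n >= 0."""
    sums, running = [], 0
    for n in range(count):
        running += c(n)
        sums.append(running)
    return sums

def abel_mean(c, r, terms=5000):
    """Truncated Abel mean A(r) = sum_{n=0}^{terms-1} c(n) r^n, for 0 <= r < 1."""
    return sum(c(n) * r**n for n in range(terms))

alternating = lambda n: (-1) ** n        # hint of part (b)
weighted = lambda n: -((-1) ** n) * n    # (-1)^(n-1) n, hint of part (d)
```

The partial sums of `alternating` oscillate between 1 and 0, while `abel_mean(alternating, r)` tracks $1/(1+r)$, which tends to $1/2$ as $r \to 1$. Likewise `abel_mean(weighted, r)` tracks $r/(1+r)^2$, which tends to $1/4$.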

14. This exercise deals with a theorem of Tauber which says that under an
additional condition on the coefficients $c_n$, the above arrows can be reversed.

(a) If $\sum c_n$ is Cesàro summable to $\sigma$ and $c_n = o(1/n)$ (that is, $n c_n \to 0$), then
$\sum c_n$ converges to $\sigma$. [Hint: $s_n - \sigma_n = [(n - 1)c_n + \cdots + c_2]/n$.]

(b) The above statement holds if we replace Cesàro summable by Abel summable.
[Hint: Estimate the difference between $\sum_{n=1}^{N} c_n$ and $\sum_{n=1}^{N} c_n r^n$ where
$r = 1 - 1/N$.]









15. Prove that the Fejér kernel is given by

\[ F_N(x) = \frac{1}{N} \frac{\sin^2(Nx/2)}{\sin^2(x/2)}. \]

[Hint: Remember that $N F_N(x) = D_0(x) + \cdots + D_{N-1}(x)$ where $D_n(x)$ is the
Dirichlet kernel. Therefore, if $\omega = e^{ix}$ we have

\[ N F_N(x) = \sum_{n=0}^{N-1} \frac{\omega^{-n} - \omega^{n+1}}{1 - \omega}.] \]
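As a sanity check before attempting the proof, one can compare the average of Dirichlet kernels with the closed form numerically. This is only a sketch; the function names are assumptions for the example, and the closed form is evaluated away from $x = 0$, where it is $0/0$.

```python
import math

def dirichlet(n, x):
    """D_n(x) = sum_{k=-n}^{n} e^{ikx}, written as 1 + 2 * sum of cos(kx)."""
    return 1.0 + 2.0 * sum(math.cos(k * x) for k in range(1, n + 1))

def fejer_average(N, x):
    """F_N(x) as the average of D_0(x), ..., D_{N-1}(x)."""
    return sum(dirichlet(n, x) for n in range(N)) / N

def fejer_closed(N, x):
    """Closed form sin^2(Nx/2) / (N sin^2(x/2)), valid when sin(x/2) != 0."""
    return math.sin(N * x / 2.0) ** 2 / (N * math.sin(x / 2.0) ** 2)
```

For example, `fejer_average(4, 0.3)` and `fejer_closed(4, 0.3)` should agree up to rounding error.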


16. The Weierstrass approximation theorem states: Let $f$ be a continuous
function on the closed and bounded interval $[a, b] \subset \mathbb{R}$. Then, for any $\epsilon > 0$,
there exists a polynomial $P$ such that

\[ \sup_{x \in [a,b]} |f(x) - P(x)| < \epsilon. \]

Prove this by applying Corollary 5.4 of Fejér’s theorem and using the fact that
the exponential function $e^{ix}$ can be approximated by polynomials uniformly on
any interval.


17. In Section 5.4 we proved that the Abel means of $f$ converge to $f$ at all
points of continuity, that is,

\[ \lim_{r \to 1} A_r(f)(\theta) = \lim_{r \to 1} (P_r * f)(\theta) = f(\theta), \quad \text{with } 0 < r < 1, \]

whenever $f$ is continuous at $\theta$. In this exercise, we will study the behavior of
$A_r(f)(\theta)$ at certain points of discontinuity.

An integrable function is said to have a jump discontinuity at $\theta$ if the two
limits

\[ \lim_{h \to 0,\ h > 0} f(\theta + h) = f(\theta^+) \quad \text{and} \quad \lim_{h \to 0,\ h > 0} f(\theta - h) = f(\theta^-) \]

exist.

(a) Prove that if $f$ has a jump discontinuity at $\theta$, then

\[ \lim_{r \to 1} A_r(f)(\theta) = \frac{f(\theta^+) + f(\theta^-)}{2}, \quad \text{with } 0 \le r < 1. \]

[Hint: Explain why $\frac{1}{2\pi} \int_{-\pi}^{0} P_r(\theta)\, d\theta = \frac{1}{2\pi} \int_{0}^{\pi} P_r(\theta)\, d\theta = \frac{1}{2}$, then modify
the proof given in the text.]

(b) Using a similar argument, show that if $f$ has a jump discontinuity at $\theta$,
the Fourier series of $f$ at $\theta$ is Cesàro summable to $\frac{f(\theta^+) + f(\theta^-)}{2}$.


18. If $P_r(\theta)$ denotes the Poisson kernel, show that the function

\[ u(r, \theta) = \frac{\partial P_r}{\partial \theta}, \]

defined for $0 \le r < 1$ and $\theta \in \mathbb{R}$, satisfies:

(i) $\Delta u = 0$ in the disc.

(ii) $\lim_{r \to 1} u(r, \theta) = 0$ for each $\theta$.

However, $u$ is not identically zero.


19. Solve Laplace’s equation $\Delta u = 0$ in the semi-infinite strip

\[ S = \{(x, y) : 0 < x < 1,\ 0 < y\}, \]

subject to the following boundary conditions

\[ \begin{cases} u(0, y) = 0 & \text{when } 0 \le y, \\ u(1, y) = 0 & \text{when } 0 \le y, \\ u(x, 0) = f(x) & \text{when } 0 \le x \le 1, \end{cases} \]

where $f$ is a given function, with of course $f(0) = f(1) = 0$. Write

\[ f(x) = \sum_{n=1}^{\infty} a_n \sin(n\pi x) \]

and expand the general solution in terms of the special solutions given by

\[ u_n(x, y) = e^{-n\pi y} \sin(n\pi x). \]

Express $u$ as an integral involving $f$, analogous to the Poisson integral
formula (6).

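20. Consider the Dirichlet problem in the annulus defined by $\{(r, \theta) : \rho < r < 1\}$,
where $0 < \rho < 1$ is the inner radius. The problem is to solve

\[ \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2} = 0 \]

subject to the boundary conditions

\[ \begin{cases} u(1, \theta) = f(\theta), \\ u(\rho, \theta) = g(\theta), \end{cases} \]

where $f$ and $g$ are given continuous functions.

Arguing as we have previously for the Dirichlet problem in the disc, we can
hope to write

\[ u(r, \theta) = \sum c_n(r) e^{in\theta} \]

with $c_n(r) = A_n r^n + B_n r^{-n}$, $n \ne 0$. Set

\[ f(\theta) \sim \sum a_n e^{in\theta} \quad \text{and} \quad g(\theta) \sim \sum b_n e^{in\theta}. \]

We want $c_n(1) = a_n$ and $c_n(\rho) = b_n$. This leads to the solution

\[
u(r, \theta) = \sum_{n \ne 0} \left( \frac{1}{\rho^n - \rho^{-n}} \right) \left[ \left( (\rho/r)^n - (r/\rho)^n \right) a_n + (r^n - r^{-n}) b_n \right] e^{in\theta}
 + a_0 + (b_0 - a_0) \frac{\log r}{\log \rho}.
\]

Show that as a result we have

\[ u(r, \theta) - (P_r * f)(\theta) \to 0 \quad \text{as } r \to 1 \text{ uniformly in } \theta, \]

and

\[ u(r, \theta) - (P_{\rho/r} * g)(\theta) \to 0 \quad \text{as } r \to \rho \text{ uniformly in } \theta. \]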


7 Problems 

1. One can construct Riemann integrable functions on $[0, 1]$ that have a dense
set of discontinuities as follows.

(a) Let $f(x) = 0$ when $x < 0$, and $f(x) = 1$ if $x \ge 0$. Choose a countable dense
sequence $\{r_n\}$ in $[0, 1]$. Then, show that the function

\[ F(x) = \sum_{n=1}^{\infty} \frac{1}{n^2} f(x - r_n) \]

is integrable and has discontinuities at all points of the sequence $\{r_n\}$.
[Hint: $F$ is monotonic and bounded.]

(b) Consider next

\[ F(x) = \sum_{n=1}^{\infty} 3^{-n} g(x - r_n), \]

where $g(x) = \sin 1/x$ when $x \ne 0$, and $g(0) = 0$. Then $F$ is integrable,
discontinuous at each $x = r_n$, and fails to be monotonic in any subinterval
of $[0, 1]$. [Hint: Use the fact that $3^{-k} > \sum_{n > k} 3^{-n}$.]

(c) The original example of Riemann is the function

\[ F(x) = \sum_{n=1}^{\infty} \frac{(nx)}{n^2}, \]

where $(x) = x$ for $x \in (-1/2, 1/2]$ and $(x)$ is continued to $\mathbb{R}$ by periodicity,
that is, $(x + 1) = (x)$. It can be shown that $F$ is discontinuous whenever
$x = m/2n$, where $m, n \in \mathbb{Z}$ with $m$ odd and $n \ne 0$.


2. Let $D_N$ denote the Dirichlet kernel

\[ D_N(\theta) = \sum_{k=-N}^{N} e^{ik\theta} = \frac{\sin((N + 1/2)\theta)}{\sin(\theta/2)}, \]

and define

\[ L_N = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_N(\theta)|\, d\theta. \]

(a) Prove that

\[ L_N \ge c \log N \]

for some constant $c > 0$. [Hint: Show that $|D_N(\theta)| \ge c \frac{|\sin((N + 1/2)\theta)|}{|\theta|}$, change
variables, and prove that

\[ L_N \ge c \int_{\pi}^{N\pi} \frac{|\sin\theta|}{|\theta|}\, d\theta + O(1). \]

Write the integral as a sum $\sum_{k=1}^{N-1} \int_{k\pi}^{(k+1)\pi}$. To conclude, use the fact that
$\sum_{k=1}^{n} 1/k \ge c \log n$.] A more careful estimate gives

\[ L_N = \frac{4}{\pi^2} \log N + O(1). \]

(b) Prove the following as a consequence: for each $n \ge 1$, there exists a
continuous function $f_n$ such that $|f_n| \le 1$ and $|S_n(f_n)(0)| \ge c' \log n$. [Hint: The
function $g_n$ which is equal to 1 when $D_n$ is positive and $-1$ when $D_n$ is
negative has the desired property but is not continuous. Approximate $g_n$
in the integral norm (in the sense of Lemma 3.2) by continuous functions
$h_k$ satisfying $|h_k| \le 1$.]
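The growth of $L_N$ can be observed numerically. The sketch below is an illustration only: it approximates the integral defining $L_N$ with a midpoint rule, and the function names and the number of sample points are assumptions for this example. It proves nothing, but it shows the slow logarithmic growth that part (a) asks for.

```python
import math

def dirichlet_closed(N, theta):
    """Closed form of D_N(theta); undefined (0/0) at theta = 0."""
    return math.sin((N + 0.5) * theta) / math.sin(theta / 2.0)

def lebesgue_constant(N, samples=20000):
    """Midpoint-rule approximation of L_N = (1/2pi) * integral of |D_N| over [-pi, pi]."""
    h = 2.0 * math.pi / samples
    # With an even number of samples the midpoints never land on theta = 0.
    total = sum(abs(dirichlet_closed(N, -math.pi + (j + 0.5) * h))
                for j in range(samples))
    return total * h / (2.0 * math.pi)
```

For $N = 1$ the integral can be computed by hand, since $D_1(\theta) = 1 + 2\cos\theta$ changes sign at $\theta = \pm 2\pi/3$; this gives $L_1 = 1/3 + 2\sqrt{3}/\pi$, which the approximation should reproduce. Printing `lebesgue_constant(N) - 4 / math.pi**2 * math.log(N)` for increasing `N` is one way to watch the $O(1)$ term stay bounded.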


3.* Littlewood provided a refinement of Tauber’s theorem:

(a) If $\sum c_n$ is Abel summable to $s$ and $c_n = O(1/n)$, then $\sum c_n$ converges to
$s$.

(b) As a consequence, if $\sum c_n$ is Cesàro summable to $s$ and $c_n = O(1/n)$, then
$\sum c_n$ converges to $s$.

These results may be applied to Fourier series. By Exercise 17, they imply that
if $f$ is an integrable function that satisfies $\hat{f}(\nu) = O(1/|\nu|)$, then:

(i) If $f$ is continuous at $\theta$, then

\[ S_N(f)(\theta) \to f(\theta) \quad \text{as } N \to \infty. \]

(ii) If $f$ has a jump discontinuity at $\theta$, then

\[ S_N(f)(\theta) \to \frac{f(\theta^+) + f(\theta^-)}{2} \quad \text{as } N \to \infty. \]

(iii) If $f$ is continuous on $[-\pi, \pi]$, then $S_N(f) \to f$ uniformly.

For the simpler assertion (b), hence a proof of (i), (ii), and (iii), see Problem 5
in Chapter 4.

Convergence of Fourier Series


The sine and cosine series, by which one can represent
an arbitrary function in a given interval, enjoy
among other remarkable properties that of being convergent.
This property did not escape the great geometer
(Fourier) who began, through the introduction
of the representation of functions just mentioned,
a new career for the applications of analysis; it was
stated in the Memoir which contains his first research
on heat. But no one so far, to my knowledge, gave a
general proof of it


G. Dirichlet, 1829 


In this chapter, we continue our study of the problem of convergence 
of Fourier series. We approach the problem from two different points of 
view. 

The first is “global” and concerns the overall behavior of a function
$f$ over the entire interval $[0, 2\pi]$. The result we have in mind is
“mean-square convergence”: if $f$ is integrable on the circle, then

\[ \frac{1}{2\pi} \int_0^{2\pi} |f(\theta) - S_N(f)(\theta)|^2\, d\theta \to 0 \quad \text{as } N \to \infty. \]

At the heart of this result is the fundamental notion of “orthogonality”;
this idea is expressed in terms of vector spaces with inner products,
and their related infinite dimensional variants, the Hilbert spaces. A
connected result is the Parseval identity which equates the mean-square
“norm” of the function with a corresponding norm of its Fourier
coefficients. Orthogonality is a fundamental mathematical notion which has
many applications in analysis.

The second viewpoint is “local” and concerns the behavior of $f$ near a
given point. The main question we consider is the problem of pointwise
convergence: does the Fourier series of $f$ converge to the value $f(\theta)$
for a given $\theta$? We first show that this convergence does indeed hold
whenever $f$ is differentiable at $\theta$. As a corollary, we obtain the Riemann
localization principle, which states that the question of whether or not
$S_N(f)(\theta) \to f(\theta)$ is completely determined by the behavior of $f$ in an
arbitrarily small interval about $\theta$. This is a remarkable result since the
Fourier coefficients, hence the Fourier series, of $f$ depend on the values
of $f$ on the whole interval $[0, 2\pi]$.

Even though convergence of the Fourier series holds at points where
$f$ is differentiable, it may fail if $f$ is merely continuous. The chapter
concludes with the presentation of a continuous function whose Fourier
series does not converge at a given point, as promised earlier.

1 Mean-square convergence of Fourier series 

The aim of this section is the proof of the following theorem. 

Theorem 1.1 Suppose f is integrable on the circle. Then

\[ \frac{1}{2\pi} \int_0^{2\pi} |f(\theta) - S_N(f)(\theta)|^2\, d\theta \to 0 \quad \text{as } N \to \infty. \]

As we remarked earlier, the key concept involved is that of orthogonality.
The correct setting for orthogonality is in a vector space equipped
with an inner product.

1.1 Vector spaces and inner products 

We now review the definitions of a vector space over $\mathbb{R}$ or $\mathbb{C}$, an inner
product, and its associated norm. In addition to the familiar
finite-dimensional vector spaces $\mathbb{R}^d$ and $\mathbb{C}^d$, we also examine two
infinite-dimensional examples which play a central role in the proof of
Theorem 1.1.


Preliminaries on vector spaces 

A vector space $V$ over the real numbers $\mathbb{R}$ is a set whose elements may be
“added” together, and “multiplied” by scalars. More precisely, we may
associate to any pair $X, Y \in V$ an element in $V$ called their sum and
denoted by $X + Y$. We require that this addition respects the usual laws
of arithmetic, such as commutativity $X + Y = Y + X$, and associativity
$X + (Y + Z) = (X + Y) + Z$, etc. Also, given any $X \in V$ and real number
$\lambda$, we assign an element $\lambda X \in V$ called the product of $X$ by $\lambda$. This
scalar multiplication must satisfy the standard properties, for instance
$\lambda_1(\lambda_2 X) = (\lambda_1 \lambda_2) X$ and $\lambda(X + Y) = \lambda X + \lambda Y$. We may instead allow
scalar multiplication by numbers in $\mathbb{C}$; we then say that $V$ is a vector
space over the complex numbers.

For example, the set $\mathbb{R}^d$ of $d$-tuples of real numbers $(x_1, x_2, \ldots, x_d)$ is
a vector space over the reals. Addition is defined componentwise by

\[ (x_1, \ldots, x_d) + (y_1, \ldots, y_d) = (x_1 + y_1, \ldots, x_d + y_d), \]

and so is multiplication by a scalar $\lambda \in \mathbb{R}$:

\[ \lambda(x_1, \ldots, x_d) = (\lambda x_1, \ldots, \lambda x_d). \]

Similarly, the space $\mathbb{C}^d$ (the complex version of the previous example)
is the set of $d$-tuples of complex numbers $(z_1, z_2, \ldots, z_d)$. It is a vector
space over $\mathbb{C}$ with addition defined componentwise by

\[ (z_1, \ldots, z_d) + (w_1, \ldots, w_d) = (z_1 + w_1, \ldots, z_d + w_d). \]

Multiplication by scalars $\lambda \in \mathbb{C}$ is given by

\[ \lambda(z_1, \ldots, z_d) = (\lambda z_1, \ldots, \lambda z_d). \]

An inner product on a vector space $V$ over $\mathbb{R}$ associates to any pair
$X, Y$ of elements in $V$ a real number which we denote by $(X, Y)$. In
particular, the inner product must be symmetric $(X, Y) = (Y, X)$ and
linear in both variables; that is,

\[ (\alpha X + \beta Y, Z) = \alpha(X, Z) + \beta(Y, Z) \]

whenever $\alpha, \beta \in \mathbb{R}$ and $X, Y, Z \in V$. Also, we require that the inner
product be positive-definite, that is, $(X, X) \ge 0$ for all $X$ in $V$. In particular,
given an inner product we may define the norm of $X$ by

\[ \|X\| = (X, X)^{1/2}. \]

If in addition $\|X\| = 0$ implies $X = 0$, we say that the inner product is
strictly positive-definite.

For example, the space $\mathbb{R}^d$ is equipped with a (strictly positive-definite)
inner product defined by

\[ (X, Y) = x_1 y_1 + \cdots + x_d y_d \]

when $X = (x_1, \ldots, x_d)$ and $Y = (y_1, \ldots, y_d)$. Then

\[ \|X\| = (X, X)^{1/2} = \sqrt{x_1^2 + \cdots + x_d^2}, \]

which is the usual Euclidean distance. One also uses the notation $|X|$
instead of $\|X\|$.

For vector spaces over the complex numbers, the inner product of two
elements is a complex number. Moreover, these inner products are called
Hermitian (instead of symmetric) since they must satisfy
$(X, Y) = \overline{(Y, X)}$. Hence the inner product is linear in the first variable,
but conjugate-linear in the second:

\[ (\alpha X + \beta Y, Z) = \alpha(X, Z) + \beta(Y, Z) \quad \text{and} \]

\[ (X, \alpha Y + \beta Z) = \overline{\alpha}(X, Y) + \overline{\beta}(X, Z). \]

Also, we must have $(X, X) \ge 0$, and the norm of $X$ is defined by
$\|X\| = (X, X)^{1/2}$ as before. Again, the inner product is strictly
positive-definite if $\|X\| = 0$ implies $X = 0$.

For example, the inner product of two vectors $Z = (z_1, \ldots, z_d)$ and
$W = (w_1, \ldots, w_d)$ in $\mathbb{C}^d$ is defined by

\[ (Z, W) = z_1 \overline{w_1} + \cdots + z_d \overline{w_d}. \]

The norm of the vector $Z$ is then given by

\[ \|Z\| = (Z, Z)^{1/2} = \sqrt{|z_1|^2 + \cdots + |z_d|^2}. \]

The presence of an inner product on a vector space allows one to define
the geometric notion of “orthogonality.” Let $V$ be a vector space (over $\mathbb{R}$
or $\mathbb{C}$) with inner product $(\cdot, \cdot)$ and associated norm $\|\cdot\|$. Two elements
$X$ and $Y$ are orthogonal if $(X, Y) = 0$, and we write $X \perp Y$. Three
important results can be derived from this notion of orthogonality:

(i) The Pythagorean theorem: if $X$ and $Y$ are orthogonal, then

\[ \|X + Y\|^2 = \|X\|^2 + \|Y\|^2. \]

(ii) The Cauchy-Schwarz inequality: for any $X, Y \in V$ we have

\[ |(X, Y)| \le \|X\|\, \|Y\|. \]

(iii) The triangle inequality: for any $X, Y \in V$ we have

\[ \|X + Y\| \le \|X\| + \|Y\|. \]

The proofs of these facts are simple. For (i) it suffices to expand
$(X + Y, X + Y)$ and use the assumption that $(X, Y) = 0$.

For (ii), we first dispose of the case when $\|Y\| = 0$ by showing that
this implies $(X, Y) = 0$ for all $X$. Indeed, for all real $t$ we have

\[ 0 \le \|X + tY\|^2 = \|X\|^2 + 2t\, \mathrm{Re}(X, Y) \]

and $\mathrm{Re}(X, Y) \ne 0$ contradicts the inequality if we take $t$ to be large and
positive (or negative). Similarly, by considering $\|X + itY\|^2$, we find that
$\mathrm{Im}(X, Y) = 0$.

If $\|Y\| \ne 0$, we may set $c = (X, Y)/(Y, Y)$; then $X - cY$ is orthogonal
to $Y$, and therefore also to $cY$. If we write $X = X - cY + cY$ and apply
the Pythagorean theorem, we get

\[ \|X\|^2 = \|X - cY\|^2 + \|cY\|^2 \ge |c|^2 \|Y\|^2. \]

Taking square roots on both sides gives the result. Note that we have
equality in the above precisely when $X = cY$.

Finally, for (iii) we first note that

\[ \|X + Y\|^2 = (X, X) + (X, Y) + (Y, X) + (Y, Y). \]

But $(X, X) = \|X\|^2$, $(Y, Y) = \|Y\|^2$, and by the Cauchy-Schwarz
inequality

\[ |(X, Y) + (Y, X)| \le 2\|X\|\, \|Y\|, \]

therefore

\[ \|X + Y\|^2 \le \|X\|^2 + 2\|X\|\, \|Y\| + \|Y\|^2 = (\|X\| + \|Y\|)^2. \]
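The projection step in the proof of the Cauchy-Schwarz inequality can be traced on explicit vectors in $\mathbb{C}^d$. The sketch below is an illustration under assumptions: the helper names are invented for this example, and vectors are plain Python lists of numbers. It computes $c = (X, Y)/(Y, Y)$ and the residual $X - cY$, so that orthogonality and the Pythagorean identity can be checked directly.

```python
import math

def inner(X, Y):
    """Hermitian inner product on C^d: sum of x_k * conj(y_k)."""
    return sum(complex(x) * complex(y).conjugate() for x, y in zip(X, Y))

def norm(X):
    """Norm induced by the inner product."""
    return math.sqrt(inner(X, X).real)

def split_along(X, Y):
    """Return (c, R) with X = R + cY and R orthogonal to Y; requires norm(Y) != 0."""
    c = inner(X, Y) / inner(Y, Y)
    R = [x - c * y for x, y in zip(X, Y)]
    return c, R
```

For instance, with `X = [1+2j, 3-1j, 0.5j]` and `Y = [2-1j, 1j, 4]`, the residual `R` returned by `split_along(X, Y)` satisfies `inner(R, Y)` equal to 0 up to rounding, and `norm(X)**2` equals `norm(R)**2 + abs(c)**2 * norm(Y)**2`, which is the identity behind the inequality.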


Two important examples 

The vector spaces $\mathbb{R}^d$ and $\mathbb{C}^d$ are finite dimensional. In the context
of Fourier series, we need to work with two infinite-dimensional vector
spaces, which we now describe.

Example 1. The vector space $\ell^2(\mathbb{Z})$ over $\mathbb{C}$ is the set of all (two-sided)
infinite sequences of complex numbers

\[ (\ldots, a_{-n}, \ldots, a_{-1}, a_0, a_1, \ldots, a_n, \ldots) \]

such that

\[ \sum_{n \in \mathbb{Z}} |a_n|^2 < \infty; \]

that is, the series converges. Addition is defined componentwise, and
so is scalar multiplication. The inner product between the two vectors
$A = (\ldots, a_{-1}, a_0, a_1, \ldots)$ and $B = (\ldots, b_{-1}, b_0, b_1, \ldots)$ is defined by the
absolutely convergent series

\[ (A, B) = \sum_{n \in \mathbb{Z}} a_n \overline{b_n}. \]

The norm of $A$ is then given by

\[ \|A\| = (A, A)^{1/2} = \left( \sum_{n \in \mathbb{Z}} |a_n|^2 \right)^{1/2}. \]

We must first check that $\ell^2(\mathbb{Z})$ is a vector space. This requires that if
$A$ and $B$ are two elements in $\ell^2(\mathbb{Z})$, then so is the vector $A + B$. To see
this, for each integer $N > 0$ we let $A_N$ denote the truncated element

\[ A_N = (\ldots, 0, 0, a_{-N}, \ldots, a_{-1}, a_0, a_1, \ldots, a_N, 0, 0, \ldots), \]

where we have set $a_n = 0$ whenever $|n| > N$. We define the truncated
element $B_N$ similarly. Then, by the triangle inequality which holds in a
finite dimensional Euclidean space, we have

\[ \|A_N + B_N\| \le \|A_N\| + \|B_N\| \le \|A\| + \|B\|. \]

Thus

\[ \sum_{|n| \le N} |a_n + b_n|^2 \le (\|A\| + \|B\|)^2, \]

and letting $N$ tend to infinity gives $\sum_{n \in \mathbb{Z}} |a_n + b_n|^2 < \infty$. It also
follows that $\|A + B\| \le \|A\| + \|B\|$, which is the triangle inequality. The
Cauchy-Schwarz inequality, which states that the sum $\sum_{n \in \mathbb{Z}} a_n \overline{b_n}$
converges absolutely and that $|(A, B)| \le \|A\|\, \|B\|$, can be deduced in the
same way from its finite analogue.

In the three examples $\mathbb{R}^d$, $\mathbb{C}^d$, and $\ell^2(\mathbb{Z})$, the vector spaces with their
inner products and norms satisfy two important properties:

(i) The inner product is strictly positive-definite, that is, $\|X\| = 0$
implies $X = 0$.

(ii) The vector space is complete, which by definition means that
every Cauchy sequence in the norm converges to a limit in the
vector space.

An inner product space with these two properties is called a Hilbert
space. We see that $\mathbb{R}^d$ and $\mathbb{C}^d$ are examples of finite-dimensional Hilbert
spaces, while $\ell^2(\mathbb{Z})$ is an example of an infinite-dimensional Hilbert space
(see Exercises 1 and 2). If either of the conditions above fails, the space
is called a pre-Hilbert space.

We now give an important example of a pre-Hilbert space where both
conditions (i) and (ii) fail.

Example 2. Let $\mathcal{R}$ denote the set of complex-valued Riemann integrable
functions on $[0, 2\pi]$ (or equivalently, integrable functions on the circle).
This is a vector space over $\mathbb{C}$. Addition is defined pointwise by

\[ (f + g)(\theta) = f(\theta) + g(\theta). \]

Naturally, multiplication by a scalar $\lambda \in \mathbb{C}$ is given by

\[ (\lambda f)(\theta) = \lambda \cdot f(\theta). \]

An inner product is defined on this vector space by

\[ (f, g) = \frac{1}{2\pi} \int_0^{2\pi} f(\theta) \overline{g(\theta)}\, d\theta. \tag{1} \]

The norm of $f$ is then

\[ \|f\| = \left( \frac{1}{2\pi} \int_0^{2\pi} |f(\theta)|^2\, d\theta \right)^{1/2}. \]

One needs to check that the analogue of the Cauchy-Schwarz and
triangle inequalities hold in this example; that is, $|(f, g)| \le \|f\|\, \|g\|$ and
$\|f + g\| \le \|f\| + \|g\|$. While these facts can be obtained as consequences
of the corresponding inequalities in the previous examples, the argument
is a little elaborate and we prefer to proceed differently.

We first observe that $2AB \le (A^2 + B^2)$ for any two real numbers $A$
and $B$. If we set $A = \lambda^{1/2} |f(\theta)|$ and $B = \lambda^{-1/2} |g(\theta)|$ with $\lambda > 0$, we get

\[ |f(\theta) \overline{g(\theta)}| \le \frac{1}{2} \left( \lambda |f(\theta)|^2 + \lambda^{-1} |g(\theta)|^2 \right). \]

We then integrate this in $\theta$ to obtain

\[ |(f, g)| \le \frac{1}{2\pi} \int_0^{2\pi} |f(\theta)|\, |\overline{g(\theta)}|\, d\theta \le \frac{1}{2} \left( \lambda \|f\|^2 + \lambda^{-1} \|g\|^2 \right). \]

Then, put $\lambda = \|g\|/\|f\|$ to get the Cauchy-Schwarz inequality. The
triangle inequality is then a simple consequence, as we have seen above.

Of course, in our choice of $\lambda$ we must assume that $\|f\| \ne 0$ and $\|g\| \ne 0$,
which leads us to the following observation.

In $\mathcal{R}$, condition (i) for a Hilbert space fails, since $\|f\| = 0$ implies only
that $f$ vanishes at its points of continuity. This is not a very serious
problem since in the appendix we show that an integrable function is
continuous except for a “negligible” set, so that $\|f\| = 0$ implies that $f$
vanishes except on a set of “measure zero.” One can get around the
difficulty that $f$ is not identically zero by adopting the convention that
such functions are actually the zero function, since for the purpose of
integration, $f$ behaves precisely like the zero function.

A more essential difficulty is that the space $\mathcal{R}$ is not complete. One
way to see this is to start with the function

\[ f(\theta) = \begin{cases} 0 & \text{for } \theta = 0, \\ \log(1/\theta) & \text{for } 0 < \theta \le 2\pi. \end{cases} \]

Since $f$ is not bounded, it does not belong to the space $\mathcal{R}$. Moreover,
the sequence of truncations $f_n$ defined by

\[ f_n(\theta) = \begin{cases} 0 & \text{for } 0 \le \theta \le 1/n, \\ f(\theta) & \text{for } 1/n < \theta \le 2\pi \end{cases} \]

can easily be seen to form a Cauchy sequence in $\mathcal{R}$ (see Exercise 5).
However, this sequence cannot converge to an element in $\mathcal{R}$, since that limit,
if it existed, would have to be $f$; for another example, see Exercise 7.
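A short computation makes the Cauchy claim concrete. It is offered as a sketch of one possible verification, not as the intended solution of Exercise 5, and the auxiliary function $G$ is notation introduced only here. For $m > n$ the difference $f_m - f_n$ equals $f$ on $(1/m, 1/n]$ and vanishes elsewhere, so with $G(\theta) = \theta\big((\log\theta)^2 - 2\log\theta + 2\big)$, an antiderivative of $(\log\theta)^2$:

```latex
\[
\|f_m - f_n\|^2
  = \frac{1}{2\pi} \int_{1/m}^{1/n} (\log\theta)^2 \, d\theta
  = \frac{G(1/n) - G(1/m)}{2\pi}
  \le \frac{G(1/n)}{2\pi}
  = \frac{(\log n)^2 + 2\log n + 2}{2\pi n}.
\]
```

The inequality uses $G(\theta) = \theta\big((\log\theta - 1)^2 + 1\big) > 0$ for $\theta > 0$. The right-hand side tends to 0 as $n \to \infty$, uniformly in $m > n$, which is the Cauchy property in the norm of $\mathcal{R}$.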

This and more complicated examples motivate the search for the
completion of $\mathcal{R}$, the class of Riemann integrable functions on $[0, 2\pi]$. The
construction and identification of this completion, the Lebesgue class
$L^2([0, 2\pi])$, represents an important turning point in the development of
analysis (somewhat akin to the much earlier completion of the rationals,
that is, the passage from $\mathbb{Q}$ to $\mathbb{R}$). A further discussion of these
fundamental ideas will be postponed until Book III, where we take up the
Lebesgue theory of integration.

We now turn to the proof of Theorem 1.1. 

1.2 Proof of mean-square convergence 

Consider the space $\mathcal{R}$ of integrable functions on the circle with inner
product

\[ (f, g) = \frac{1}{2\pi} \int_0^{2\pi} f(\theta) \overline{g(\theta)}\, d\theta \]

and norm $\|f\|$ defined by

\[ \|f\|^2 = (f, f) = \frac{1}{2\pi} \int_0^{2\pi} |f(\theta)|^2\, d\theta. \]

With this notation, we must prove that $\|f - S_N(f)\| \to 0$ as $N$ tends to
infinity.

For each integer $n$, let $e_n(\theta) = e^{in\theta}$, and observe that the family $\{e_n\}_{n \in \mathbb{Z}}$
is orthonormal; that is,

\[ (e_n, e_m) = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{if } n \ne m. \end{cases} \]

Let $f$ be an integrable function on the circle, and let $a_n$ denote its Fourier
coefficients. An important observation is that these Fourier coefficients
are represented by inner products of $f$ with the elements in the
orthonormal set $\{e_n\}_{n \in \mathbb{Z}}$:

\[ (f, e_n) = \frac{1}{2\pi} \int_0^{2\pi} f(\theta) e^{-in\theta}\, d\theta = a_n. \]

In particular, $S_N(f) = \sum_{|n| \le N} a_n e_n$. Then the orthonormal property of
the family $\{e_n\}$ and the fact that $a_n = (f, e_n)$ imply that the difference
$f - \sum_{|n| \le N} a_n e_n$ is orthogonal to $e_n$ for all $|n| \le N$. Therefore, we must
have

\[ \Big( f - \sum_{|n| \le N} a_n e_n \Big) \perp \sum_{|n| \le N} b_n e_n \tag{2} \]

for any complex numbers $b_n$. We draw two conclusions from this fact.
First, we can apply the Pythagorean theorem to the decomposition

\[ f = f - \sum_{|n| \le N} a_n e_n + \sum_{|n| \le N} a_n e_n, \]

where we now choose $b_n = a_n$, to obtain

\[ \|f\|^2 = \Big\| f - \sum_{|n| \le N} a_n e_n \Big\|^2 + \Big\| \sum_{|n| \le N} a_n e_n \Big\|^2. \]

Since the orthonormal property of the family $\{e_n\}_{n \in \mathbb{Z}}$ implies that

\[ \Big\| \sum_{|n| \le N} a_n e_n \Big\|^2 = \sum_{|n| \le N} |a_n|^2, \]

we deduce that

\[ \|f\|^2 = \|f - S_N(f)\|^2 + \sum_{|n| \le N} |a_n|^2. \tag{3} \]

The second conclusion we may draw from (2) is the following simple
lemma.

Lemma 1.2 (Best approximation) If f is integrable on the circle with
Fourier coefficients $a_n$, then

\[ \|f - S_N(f)\| \le \Big\| f - \sum_{|n| \le N} c_n e_n \Big\| \]

for any complex numbers $c_n$. Moreover, equality holds precisely when
$c_n = a_n$ for all $|n| \le N$.

Proof. This follows immediately by applying the Pythagorean
theorem to

\[ f - \sum_{|n| \le N} c_n e_n = f - S_N(f) + \sum_{|n| \le N} b_n e_n, \]

where $b_n = a_n - c_n$.
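The lemma can be observed in a discretized setting. The sketch below is an illustration under stated assumptions: integrals over $[0, 2\pi]$ are replaced by Riemann sums over `M` equally spaced points, the sample function is $f(\theta) = \theta$, and all names are invented for the example. On such a grid the sampled exponentials $e_n$ with $|n| < M/2$ are exactly orthonormal for the discrete inner product, so the Pythagorean identity from the proof holds up to rounding.

```python
import cmath
import math

M = 512                                   # number of sample points (an assumption)
THETAS = [2.0 * math.pi * j / M for j in range(M)]

def inner(f_vals, g_vals):
    """Riemann-sum stand-in for (1/2pi) * integral of f * conj(g) over [0, 2pi]."""
    return sum(f * g.conjugate() for f, g in zip(f_vals, g_vals)) / M

def e(n):
    """Samples of e_n(theta) = e^{in theta}."""
    return [cmath.exp(1j * n * t) for t in THETAS]

def distance(f_vals, coeffs):
    """Discrete norm of f - sum_n coeffs[n] e_n."""
    residual = list(f_vals)
    for n, c in coeffs.items():
        residual = [r - c * v for r, v in zip(residual, e(n))]
    return math.sqrt(inner(residual, residual).real)

f_vals = [complex(t) for t in THETAS]     # sample function f(theta) = theta
N = 5
a = {n: inner(f_vals, e(n)) for n in range(-N, N + 1)}   # discrete Fourier coefficients
c = dict(a)
c[2] += 0.1                               # move one coefficient away from a_2

best = distance(f_vals, a)                # distance to the partial sum
other = distance(f_vals, c)               # distance to the perturbed polynomial
```

Here `other**2 - best**2` equals $|b_2|^2 = 0.01$ up to rounding, matching the decomposition in the proof with $b_n = a_n - c_n$.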

This lemma has a clear geometric interpretation. It says that the
trigonometric polynomial of degree at most $N$ which is closest to $f$ in
the norm $\|\cdot\|$ is the partial sum $S_N(f)$. This geometric property of the
partial sums is depicted in Figure 1, where the orthogonal projection of
$f$ in the plane spanned by $\{e_{-N}, \ldots, e_0, \ldots, e_N\}$ is simply $S_N(f)$.

We can now give the proof that $\|S_N(f) - f\| \to 0$ using the best
approximation lemma, as well as the important fact that trigonometric
polynomials are dense in the space of continuous functions on the circle.

Suppose that $f$ is continuous on the circle. Then, given $\epsilon > 0$, there
exists (by Corollary 5.4 in Chapter 2) a trigonometric polynomial $P$, say
of degree $M$, such that

\[ |f(\theta) - P(\theta)| < \epsilon \quad \text{for all } \theta. \]

In particular, taking squares and integrating this inequality yields
$\|f - P\| < \epsilon$, and by the best approximation lemma we conclude that

\[ \|f - S_N(f)\| < \epsilon \quad \text{whenever } N \ge M. \]

This proves Theorem 1.1 when $f$ is continuous.

If $f$ is merely integrable, we can no longer approximate $f$ uniformly
by trigonometric polynomials. Instead, we apply the approximation
Lemma 3.2 in Chapter 2 and choose a continuous function $g$ on the
circle which satisfies

$$\sup_{\theta\in[0,2\pi]} |g(\theta)| \le \sup_{\theta\in[0,2\pi]} |f(\theta)| = B,$$

and

$$\int_0^{2\pi} |f(\theta) - g(\theta)|\,d\theta < \epsilon^2.$$

Then we get

$$\begin{aligned}
\|f - g\|^2 &= \frac{1}{2\pi}\int_0^{2\pi} |f(\theta) - g(\theta)|^2\,d\theta \\
&= \frac{1}{2\pi}\int_0^{2\pi} |f(\theta) - g(\theta)|\,|f(\theta) - g(\theta)|\,d\theta \\
&\le \frac{2B}{2\pi}\int_0^{2\pi} |f(\theta) - g(\theta)|\,d\theta \\
&\le C\epsilon^2.
\end{aligned}$$

Now we may approximate $g$ by a trigonometric polynomial $P$ so that
$\|g - P\| < \epsilon$. Then $\|f - P\| < C'\epsilon$, and we may again conclude by
applying the best approximation lemma. This completes the proof that the
partial sums of the Fourier series of $f$ converge to $f$ in the mean square
norm $\|\cdot\|$.

Note that this result and the relation (3) imply that if $a_n$ is the $n$th
Fourier coefficient of an integrable function $f$, then the series
$\sum_{n=-\infty}^{\infty} |a_n|^2$ converges, and in fact we have Parseval's identity

$$\sum_{n=-\infty}^{\infty} |a_n|^2 = \|f\|^2.$$

This identity provides an important connection between the norms in
the two vector spaces $\ell^2(\mathbb{Z})$ and $\mathcal{R}$.

We now summarize the results of this section. 


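Theorem 1.3 Let $f$ be an integrable function on the circle with
$f \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}$. Then we have:

(i) Mean-square convergence of the Fourier series

$$\frac{1}{2\pi}\int_0^{2\pi} |f(\theta) - S_N(f)(\theta)|^2\,d\theta \to 0 \quad \text{as } N \to \infty.$$

(ii) Parseval's identity

$$\sum_{n=-\infty}^{\infty} |a_n|^2 = \frac{1}{2\pi}\int_0^{2\pi} |f(\theta)|^2\,d\theta.$$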
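As a rough numerical illustration of Parseval's identity, the following sketch compares partial sums of $\sum |a_n|^2$ with $\|f\|^2$ for the sample function $f(\theta) = \theta$ on $[-\pi, \pi)$, for which $\|f\|^2 = \pi^2/3$. The choice of function, the midpoint quadrature rule, and the helper names are assumptions made for this example.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=2048):
    """Midpoint-rule approximation of (1/2pi) * integral over [-pi, pi] of f(t) e^{-int} dt."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        t = -math.pi + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * n * t)
    return total / samples

def bessel_sum(f, N):
    """Partial sum of |a_n|^2 over |n| <= N."""
    return sum(abs(fourier_coefficient(f, n)) ** 2 for n in range(-N, N + 1))

def f(t):
    return t  # f(theta) = theta on [-pi, pi)

norm_squared = math.pi ** 2 / 3  # (1/2pi) * integral of theta^2 over [-pi, pi]
partial_sums = {N: bessel_sum(f, N) for N in (5, 20, 40)}
for N, value in partial_sums.items():
    print(N, value, norm_squared - value)
```

The partial sums increase with $N$ and stay below $\|f\|^2$ up to quadrature error, and the gap shrinks roughly like $2/N$ because $|a_n| = 1/|n|$ for this particular function.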


Remark 1. If $\{e_n\}$ is any orthonormal family of functions on the
circle, and $a_n = (f, e_n)$, then we may deduce from the relation (3) that

$$\sum_{n=-\infty}^{\infty} |a_n|^2 \le \|f\|^2.$$

This is known as Bessel's inequality. Equality holds (as in Parseval's
identity) precisely when the family $\{e_n\}$ is also a "basis," in the sense
that $\|\sum_{|n|\le N} a_n e_n - f\| \to 0$ as $N \to \infty$.

Remark 2. We may associate to every integrable function the sequence
$\{a_n\}$ formed by its Fourier coefficients. Parseval's identity guarantees
that $\{a_n\} \in \ell^2(\mathbb{Z})$. Since $\ell^2(\mathbb{Z})$ is a Hilbert space, the failure of $\mathcal{R}$
to be complete, discussed earlier, may be understood as follows: there
exist sequences $\{a_n\}_{n\in\mathbb{Z}}$ such that $\sum_{n\in\mathbb{Z}} |a_n|^2 < \infty$, yet no Riemann
integrable function $F$ has $n$th Fourier coefficient equal to $a_n$ for all $n$. An
example is given in Exercise 6.

Since the terms of a converging series tend to 0, we deduce from
Parseval's identity or Bessel's inequality the following result.

Theorem 1.4 (Riemann-Lebesgue lemma) If $f$ is integrable on the
circle, then $\hat f(n) \to 0$ as $|n| \to \infty$.

An equivalent reformulation of this proposition is that if $f$ is integrable
on $[0, 2\pi]$, then

$$\int_0^{2\pi} f(\theta)\sin(N\theta)\,d\theta \to 0 \quad \text{as } N \to \infty$$

and

$$\int_0^{2\pi} f(\theta)\cos(N\theta)\,d\theta \to 0 \quad \text{as } N \to \infty.$$
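The decay asserted by the Riemann-Lebesgue lemma is easy to observe numerically. The sketch below uses the sample function $f(\theta) = \theta^2$, for which two integrations by parts give $\int_0^{2\pi} \theta^2 \sin(N\theta)\,d\theta = -4\pi^2/N$; the function, the midpoint rule, and the helper name are assumptions of this illustration.

```python
import math

def sine_integral_against(f, N, samples=20000):
    """Midpoint-rule approximation of the integral over [0, 2pi] of f(t) sin(N t) dt."""
    h = 2 * math.pi / samples
    return h * sum(f((j + 0.5) * h) * math.sin(N * (j + 0.5) * h) for j in range(samples))

def f(t):
    return t * t

values = {N: sine_integral_against(f, N) for N in (1, 10, 100)}
for N, value in values.items():
    print(N, value, -4 * math.pi ** 2 / N)
```

For this function the integrals decay like $1/N$; the lemma itself promises only convergence to 0, with no rate, for a general integrable $f$.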

To conclude this section, we give a more general version of the Parseval 
identity which we will use in the next chapter. 

Lemma 1.5 Suppose $F$ and $G$ are integrable on the circle with

$$F \sim \sum a_n e^{in\theta} \quad \text{and} \quad G \sim \sum b_n e^{in\theta}.$$

Then

$$\frac{1}{2\pi}\int_0^{2\pi} F(\theta)\overline{G(\theta)}\,d\theta = \sum_{n=-\infty}^{\infty} a_n \overline{b_n}.$$

Recall from the discussion in Example 1 that the series $\sum_{n=-\infty}^{\infty} a_n \overline{b_n}$
converges absolutely.

Proof. The proof follows from Parseval's identity and the fact that

$$(F, G) = \frac{1}{4}\bigl[\|F + G\|^2 - \|F - G\|^2 + i\bigl(\|F + iG\|^2 - \|F - iG\|^2\bigr)\bigr]$$

which holds in every Hermitian inner product space. The verification of
this fact is left to the reader.
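The polarization identity used in this proof can be sanity-checked in the finite-dimensional space $\mathbb{C}^d$. The sketch below assumes the convention that the inner product is linear in the first slot and conjugate-linear in the second; the vectors are arbitrary test data and the function names are hypothetical.

```python
def inner(u, v):
    """Hermitian inner product on C^d, linear in u and conjugate-linear in v."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def norm_squared(u):
    return inner(u, u).real

def polarization(F, G):
    """Right-hand side of the identity, built only from norms."""
    plus = [a + b for a, b in zip(F, G)]
    minus = [a - b for a, b in zip(F, G)]
    plus_i = [a + 1j * b for a, b in zip(F, G)]
    minus_i = [a - 1j * b for a, b in zip(F, G)]
    return 0.25 * (norm_squared(plus) - norm_squared(minus)
                   + 1j * (norm_squared(plus_i) - norm_squared(minus_i)))

F = [1 + 2j, -0.5j, 3.0 + 0j]
G = [2 - 1j, 4 + 0j, -1 + 1j]
direct = inner(F, G)
recovered = polarization(F, G)
print(direct, recovered)
```

A numerical check is not a proof, but it confirms the signs: the first difference of norms recovers $4\,\mathrm{Re}(F, G)$ and the second recovers $4\,\mathrm{Im}(F, G)$.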

2 Return to pointwise convergence 

The mean-square convergence theorem does not provide further insight
into the problem of pointwise convergence. Indeed, Theorem 1.1 by itself
does not guarantee that the Fourier series converges for any $\theta$. Exercise 3
helps to explain this statement. However, if a function is differentiable
at a point $\theta_0$, then its Fourier series converges at $\theta_0$. After proving this
result, we give an example of a continuous function with diverging Fourier
series at one point. These phenomena are indicative of the intricate
nature of the problem of pointwise convergence in the theory of Fourier
series.




2.1 A local result 

Theorem 2.1 Let $f$ be an integrable function on the circle which is
differentiable at a point $\theta_0$. Then $S_N(f)(\theta_0) \to f(\theta_0)$ as $N$ tends to infinity.

Proof. Define

$$F(t) = \begin{cases} \dfrac{f(\theta_0 - t) - f(\theta_0)}{t} & \text{if } t \ne 0 \text{ and } |t| < \pi \\[2mm] -f'(\theta_0) & \text{if } t = 0. \end{cases}$$

First, $F$ is bounded near 0 since $f$ is differentiable there. Second, for
all small $\delta$ the function $F$ is integrable on $[-\pi, -\delta] \cup [\delta, \pi]$ because $f$ has
this property and $|t| > \delta$ there. As a consequence of Proposition 1.4 in
the appendix, the function $F$ is integrable on all of $[-\pi, \pi]$. We know
that $S_N(f)(\theta_0) = (f * D_N)(\theta_0)$, where $D_N$ is the Dirichlet kernel. Since
$\frac{1}{2\pi}\int D_N = 1$, we find that

$$\begin{aligned}
S_N(f)(\theta_0) - f(\theta_0) &= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta_0 - t) D_N(t)\,dt - f(\theta_0) \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} [f(\theta_0 - t) - f(\theta_0)] D_N(t)\,dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} F(t)\,t D_N(t)\,dt.
\end{aligned}$$

We recall that

$$t D_N(t) = \frac{t}{\sin(t/2)}\sin((N + 1/2)t),$$

where the quotient $\frac{t}{\sin(t/2)}$ is continuous in the interval $[-\pi, \pi]$. Since we
can write

$$\sin((N + 1/2)t) = \sin(Nt)\cos(t/2) + \cos(Nt)\sin(t/2),$$

we can apply the Riemann-Lebesgue lemma to the Riemann integrable
functions $F(t)t\cos(t/2)/\sin(t/2)$ and $F(t)t$ to finish the proof of the
theorem.

Observe that the conclusion of the theorem still holds if we only assume
that $f$ satisfies a Lipschitz condition at $\theta_0$; that is,

$$|f(\theta) - f(\theta_0)| \le M|\theta - \theta_0|$$

for some $M \ge 0$ and all $\theta$. This is the same as saying that $f$ satisfies a
Hölder condition of order $\alpha = 1$.

A striking consequence of this theorem is the localization principle of
Riemann. This result states that the convergence of $S_N(f)(\theta_0)$ depends
only on the behavior of $f$ near $\theta_0$. This is not clear at first, since forming
the Fourier series requires integrating $f$ over the whole circle.
Theorem 2.2 Suppose $f$ and $g$ are two integrable functions defined on
the circle, and for some $\theta_0$ there exists an open interval $I$ containing $\theta_0$
such that

$$f(\theta) = g(\theta) \quad \text{for all } \theta \in I.$$

Then $S_N(f)(\theta_0) - S_N(g)(\theta_0) \to 0$ as $N$ tends to infinity.

Proof. The function $f - g$ is 0 in $I$, so it is differentiable at $\theta_0$, and
we may apply the previous theorem to conclude the proof.


2.2 A continuous function with diverging Fourier series 


We now turn our attention to an example of a continuous periodic function
whose Fourier series diverges at a point. Thus, Theorem 2.1 fails
if the differentiability assumption is replaced by the weaker assumption
of continuity. Our counter-example shows that this hypothesis, which
had appeared plausible, is in fact false; moreover, its construction also
illuminates an important principle of the theory.

The principle that is involved here will be referred to as
"symmetry-breaking."$^1$ The symmetry that we have in mind is the symmetry
between the frequencies $e^{in\theta}$ and $e^{-in\theta}$ which appear in the Fourier
expansion of a function. For example, the partial sum operator $S_N$ is defined
in a way that reflects this symmetry. Also, the Dirichlet, Fejér, and
Poisson kernels are symmetric in this sense. When we break the symmetry,
that is, when we split the Fourier series $\sum_{n=-\infty}^{\infty} a_n e^{in\theta}$ into the two
pieces $\sum_{n\ge 0} a_n e^{in\theta}$ and $\sum_{n<0} a_n e^{in\theta}$, we introduce new and far-reaching
phenomena.

We give a simple example. Start with the sawtooth function $f$ which is
odd in $\theta$ and which equals $i(\pi - \theta)$ when $0 < \theta < \pi$. Then, by Exercise 8
in Chapter 2, we know that

$$f(\theta) \sim \sum_{n\ne 0} \frac{e^{in\theta}}{n}. \tag{4}$$

Consider now the result of breaking the symmetry and the resulting series

$$\sum_{n=-\infty}^{-1} \frac{e^{in\theta}}{n}.$$

Then, unlike (4), the above is no longer the Fourier series of a Riemann
integrable function. Indeed, suppose it were the Fourier series of an
integrable function, say $\tilde f$, where in particular $\tilde f$ is bounded. Using the
Abel means, we then have

$$|A_r(\tilde f)(0)| = \sum_{n=1}^{\infty} \frac{r^n}{n},$$

which tends to infinity as $r$ tends to 1, because $\sum 1/n$ diverges. This
gives the desired contradiction since

$$|A_r(\tilde f)(0)| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |\tilde f(\theta)|\,P_r(\theta)\,d\theta \le \sup_{\theta} |\tilde f(\theta)|,$$

where $P_r(\theta)$ denotes the Poisson kernel discussed in the previous chapter.

$^1$ We have borrowed this terminology from physics, where it is used in a very different
context.


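The sawtooth function is the object from which we will fashion our
counter-example. We proceed as follows. For each $N \ge 1$ we define the
following two functions on $[-\pi, \pi]$,

$$f_N(\theta) = \sum_{1\le|n|\le N} \frac{e^{in\theta}}{n} \quad \text{and} \quad \tilde f_N(\theta) = \sum_{-N\le n\le -1} \frac{e^{in\theta}}{n}.$$

We contend that:

(i) $|\tilde f_N(0)| \ge c\log N$.

(ii) $f_N(\theta)$ is uniformly bounded in $N$ and $\theta$.

The first statement is a consequence of the fact that $\sum_{n=1}^{N} 1/n \ge \log N$,
which is easily established (see also Figure 2):

$$\sum_{n=1}^{N} \frac{1}{n} \ge \sum_{n=1}^{N-1} \int_n^{n+1} \frac{dx}{x} = \int_1^N \frac{dx}{x} = \log N.$$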
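Both contentions can be observed numerically before they are proved. In the sketch below, the broken sum at $\theta = 0$ is compared with $\log N$, and the symmetric sum is evaluated on a grid using $f_N(\theta) = 2i\sum_{n=1}^{N} \sin(n\theta)/n$. The grid, the values of $N$, and the function names are assumptions of this illustration; a finite grid can suggest a uniform bound but cannot establish one.

```python
import math

def f_tilde_at_zero(N):
    """tilde f_N(0): the sum of 1/n over -N <= n <= -1, i.e. -(1 + 1/2 + ... + 1/N)."""
    return sum(1.0 / n for n in range(-N, 0))

def f_N_abs(N, theta):
    """|f_N(theta)|, using f_N(theta) = 2i * sum_{n=1}^{N} sin(n theta)/n."""
    return abs(2 * sum(math.sin(n * theta) / n for n in range(1, N + 1)))

grid = [-math.pi + 2 * math.pi * j / 1000 for j in range(1001)]
for N in (10, 100, 400):
    broken = abs(f_tilde_at_zero(N))
    symmetric_max = max(f_N_abs(N, t) for t in grid)
    print(N, broken, math.log(N), symmetric_max)
```

The first column of values grows with $N$ at least as fast as $\log N$, while the maxima of $|f_N|$ over the grid stay of the same size as $N$ increases.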


To prove (ii), we argue in the same spirit as in the proof of Tauber's
theorem, which says that if the series $\sum c_n$ is Abel summable to $s$ and
$c_n = o(1/n)$, then $\sum c_n$ actually converges to $s$ (see Exercise 14 in
Chapter 2). In fact, the proof of Tauber's theorem is quite similar to that of
the lemma below.

Lemma 2.3 Suppose that the Abel means $A_r = \sum_{n=1}^{\infty} r^n c_n$ of the series
$\sum_{n=1}^{\infty} c_n$ are bounded as $r$ tends to 1 (with $r < 1$). If $c_n = O(1/n)$, then
the partial sums $S_N = \sum_{n=1}^{N} c_n$ are bounded.

Figure 2. Comparing a sum with an integral


Proof. Let $r = 1 - 1/N$ and choose $M$ so that $n|c_n| \le M$. We estimate
the difference

$$S_N - A_r = \sum_{n=1}^{N} (c_n - r^n c_n) - \sum_{n=N+1}^{\infty} r^n c_n$$

as follows:

$$\begin{aligned}
|S_N - A_r| &\le \sum_{n=1}^{N} |c_n|(1 - r^n) + \sum_{n=N+1}^{\infty} r^n |c_n| \\
&\le M \sum_{n=1}^{N} (1 - r) + \frac{M}{N} \sum_{n=N+1}^{\infty} r^n \\
&\le MN(1 - r) + \frac{M}{N}\,\frac{1}{1 - r} \\
&= 2M,
\end{aligned}$$

where we have used the simple observation that

$$1 - r^n = (1 - r)(1 + r + \cdots + r^{n-1}) \le n(1 - r).$$

So we see that if $M$ satisfies both $|A_r| \le M$ and $n|c_n| \le M$, then $|S_N| \le 3M$.


We apply the lemma to the series

$$\sum_{n\ne 0} \frac{e^{in\theta}}{n},$$

which is the Fourier series of the sawtooth function $f$ used above. Here
$c_n = e^{in\theta}/n + e^{-in\theta}/(-n)$ for $n \ne 0$, so clearly $c_n = O(1/|n|)$. Finally,
the Abel means of this series are $A_r(f)(\theta) = (f * P_r)(\theta)$. But $f$ is bounded
and $P_r$ is a good kernel, so $S_N(f)(\theta)$ is uniformly bounded in $N$ and $\theta$,
as was to be shown.


We now come to the heart of the matter. Notice that $f_N$ and $\tilde f_N$
are trigonometric polynomials of degree $N$ (that is, they have non-zero
Fourier coefficients only when $|n| \le N$). From these, we form trigonometric
polynomials $P_N$ and $\tilde P_N$, now of degrees $3N$ and $2N - 1$, by
displacing the frequencies of $f_N$ and $\tilde f_N$ by $2N$ units. In other words,
we define $P_N(\theta) = e^{i(2N)\theta} f_N(\theta)$ and $\tilde P_N(\theta) = e^{i(2N)\theta} \tilde f_N(\theta)$. So while $f_N$
has non-vanishing Fourier coefficients when $0 < |n| \le N$, now the
coefficients of $P_N$ are non-vanishing for $N \le n \le 3N$, $n \ne 2N$. Moreover,
while $n = 0$ is the center of symmetry of $f_N$, now $n = 2N$ is the center
of symmetry of $P_N$. We next consider the partial sums $S_M$.

Lemma 2.4

$$S_M(P_N) = \begin{cases} P_N & \text{if } M \ge 3N, \\ \tilde P_N & \text{if } M = 2N, \\ 0 & \text{if } M < N. \end{cases}$$

This is clear from what has been said above and from Figure 3. 
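Since $P_N$ is a trigonometric polynomial, the lemma is a statement about which frequencies survive truncation, and it can be checked by bookkeeping on coefficients alone. The sketch below represents a trigonometric polynomial as a dictionary from frequency to coefficient; this representation and the helper names are assumptions made for the illustration.

```python
from fractions import Fraction

def f_N_coefficients(N):
    """Fourier coefficients of f_N: 1/n at each frequency n with 1 <= |n| <= N."""
    return {n: Fraction(1, n) for n in range(-N, N + 1) if n != 0}

def shift(coeffs, k):
    """Multiplying by e^{ik theta} moves every frequency n to n + k."""
    return {n + k: c for n, c in coeffs.items()}

def partial_sum(coeffs, M):
    """S_M keeps exactly the frequencies with |n| <= M."""
    return {n: c for n, c in coeffs.items() if abs(n) <= M}

def value_at_zero(coeffs):
    """A trigonometric polynomial evaluated at theta = 0 is the sum of its coefficients."""
    return sum(coeffs.values(), Fraction(0))

N = 4
P_N = shift(f_N_coefficients(N), 2 * N)
P_N_tilde = shift({n: c for n, c in f_N_coefficients(N).items() if n < 0}, 2 * N)

print(partial_sum(P_N, 3 * N) == P_N)        # M >= 3N leaves P_N unchanged
print(partial_sum(P_N, 2 * N) == P_N_tilde)  # M = 2N keeps only the lower half
print(partial_sum(P_N, N - 1) == {})         # M < N removes everything
print(value_at_zero(P_N), value_at_zero(partial_sum(P_N, 2 * N)))
```

At $\theta = 0$ the full polynomial $P_N$ sums to zero because the coefficients $1/m$ and $1/(-m)$ cancel in pairs, whereas the truncated sum $S_{2N}(P_N)(0)$ equals $-(1 + 1/2 + \cdots + 1/N)$. That loss of cancellation is what the construction exploits.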


Figure 3. Breaking symmetry in Lemma 2.4 (frequency diagrams for $f_N(\theta)$
on $[-N, N]$, for $e^{i(2N)\theta} f_N(\theta) = P_N(\theta)$ on $[N, 3N]$, and for
$S_{2N}(e^{i(2N)\theta} f_N)(\theta) = e^{i(2N)\theta} \tilde f_N(\theta)$)


The effect is that when $M = 2N$, the operator $S_M$ breaks the symmetry
of $P_N$, but in the other cases covered in the lemma, the action of $S_M$
is relatively benign, since then the outcome is either $P_N$ or 0.

Finally, we need to find a convergent series of positive terms $\sum \alpha_k$ and
a sequence of integers $\{N_k\}$ which increases rapidly enough so that:

(i) $N_{k+1} > 3N_k$,

(ii) $\alpha_k \log N_k \to \infty$ as $k \to \infty$.

We choose (for example) $\alpha_k = 1/k^2$ and $N_k = 3^{2^k}$ which are easily seen
to satisfy the above criteria.
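The two criteria for the example choice $\alpha_k = 1/k^2$, $N_k = 3^{2^k}$ can be tabulated directly. In the sketch below, $\log N_k$ is computed as $2^k \log 3$ so that the very large integers $N_k$ are only formed for the comparison in (i); the function names are hypothetical.

```python
import math

def N_k(k):
    return 3 ** (2 ** k)

def alpha_k(k):
    return 1.0 / k ** 2

def alpha_log_N(k):
    # log N_k = 2^k * log 3, computed without forming the huge integer N_k
    return alpha_k(k) * (2 ** k) * math.log(3)

for k in range(1, 9):
    print(k, N_k(k + 1) > 3 * N_k(k), alpha_log_N(k))
```

Note that $\alpha_k \log N_k$ dips for the first few values of $k$ before the exponential factor $2^k$ overtakes $k^2$; only the behavior as $k \to \infty$ matters for the construction.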

Finally, we can write down our desired function. It is

$$f(\theta) = \sum_{k=1}^{\infty} \alpha_k P_{N_k}(\theta).$$

Due to the uniform boundedness of the $P_N$ (recall that $|P_N(\theta)| = |f_N(\theta)|$),
the series above converges uniformly to a continuous periodic function.
However, by our lemma we get

$$|S_{2N_m}(f)(0)| \ge c\,\alpha_m \log N_m + O(1) \to \infty \quad \text{as } m \to \infty.$$

Figure 4. Symmetry broken in the middle interval $(N_k, 3N_k)$

Indeed, the terms that correspond to $N_k$ with $k < m$ or $k > m$ contribute
$O(1)$ or 0, respectively (because the $P_N$'s are uniformly bounded),
while the term that corresponds to $N_m$ is in absolute value greater than
$c\,\alpha_m \log N_m$ because $|\tilde P_N(0)| = |\tilde f_N(0)| \ge c\log N$. So the partial sums of
the Fourier series of $f$ at 0 are not bounded, and we are done since this
proves the divergence of the Fourier series of $f$ at $\theta = 0$. To produce a
function whose series diverges at any other preassigned $\theta = \theta_0$, it suffices
to consider the function $f(\theta - \theta_0)$.

3 Exercises 


1. Show that the first two examples of inner product spaces, namely $\mathbb{R}^d$ and $\mathbb{C}^d$,
are complete.

[Hint: Every Cauchy sequence in $\mathbb{R}$ has a limit.]

2. Prove that the vector space $\ell^2(\mathbb{Z})$ is complete.

[Hint: Suppose $A_k = \{a_{k,n}\}_{n\in\mathbb{Z}}$ with $k = 1, 2, \ldots$ is a Cauchy sequence. Show
that for each $n$, $\{a_{k,n}\}_{k=1}^{\infty}$ is a Cauchy sequence of complex numbers, therefore
it converges to a limit, say $b_n$. By taking partial sums of $\|A_k - A_{k'}\|$ and letting
$k' \to \infty$, show that $\|A_k - B\| \to 0$ as $k \to \infty$, where $B = (\ldots, b_{-1}, b_0, b_1, \ldots)$.
Finally, prove that $B \in \ell^2(\mathbb{Z})$.]

3. Construct a sequence of integrable functions $\{f_k\}$ on $[0, 2\pi]$ such that

$$\lim_{k\to\infty} \frac{1}{2\pi}\int_0^{2\pi} |f_k(\theta)|^2\,d\theta = 0$$

but $\lim_{k\to\infty} f_k(\theta)$ fails to exist for any $\theta$.

[Hint: Choose a sequence of intervals $I_k \subset [0, 2\pi]$ whose lengths tend to 0, and
so that each point belongs to infinitely many of them; then let $f_k = \chi_{I_k}$.]
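One concrete way to realize the hint in Exercise 3 is to sweep across $[0, 2\pi]$ with dyadic intervals of shrinking length. The indexing below (write $k = 2^m + j$ and take the $j$-th interval of length $2\pi/2^m$) is a choice made for this sketch and is not prescribed by the exercise; the function names are likewise hypothetical.

```python
import math

def interval(k):
    """I_k for k >= 1: write k = 2^m + j with 0 <= j < 2^m and take the j-th dyadic
    subinterval of [0, 2pi] of length 2pi / 2^m."""
    m = k.bit_length() - 1
    j = k - 2 ** m
    width = 2 * math.pi / 2 ** m
    return j * width, (j + 1) * width

def f_k(k, theta):
    """Characteristic function of I_k evaluated at theta."""
    left, right = interval(k)
    return 1 if left <= theta <= right else 0

def norm_squared(k):
    """(1/2pi) * integral of |f_k|^2, which is the length of I_k divided by 2pi."""
    left, right = interval(k)
    return (right - left) / (2 * math.pi)

theta = 1.0
tail = [f_k(k, theta) for k in range(2 ** 10, 2 ** 12)]
print(norm_squared(2 ** 10), min(tail), max(tail))
```

The norms tend to 0, yet at a fixed $\theta$ the values $f_k(\theta)$ keep returning to both 0 and 1 during every sweep, so they cannot converge.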

4. Recall the vector space $\mathcal{R}$ of integrable functions, with its inner product and
norm

$$\|f\| = \left(\frac{1}{2\pi}\int_0^{2\pi} |f(x)|^2\,dx\right)^{1/2}.$$

(a) Show that there exist non-zero integrable functions $f$ for which $\|f\| = 0$.

(b) However, show that if $f \in \mathcal{R}$ with $\|f\| = 0$, then $f(x) = 0$ whenever $f$ is
continuous at $x$.

(c) Conversely, show that if $f \in \mathcal{R}$ vanishes at all of its points of continuity,
then $\|f\| = 0$.

5. Let

$$f(\theta) = \begin{cases} 0 & \text{for } \theta = 0 \\ \log(1/\theta) & \text{for } 0 < \theta \le 2\pi, \end{cases}$$

and define a sequence of functions in $\mathcal{R}$ by

$$f_n(\theta) = \begin{cases} 0 & \text{for } 0 \le \theta \le 1/n \\ f(\theta) & \text{for } 1/n < \theta \le 2\pi. \end{cases}$$

Prove that $\{f_n\}_{n=1}^{\infty}$ is a Cauchy sequence in $\mathcal{R}$. However, $f$ does not belong to
$\mathcal{R}$.

[Hint: Show that $\int_a^b (\log\theta)^2\,d\theta \to 0$ if $0 < a < b$ and $b \to 0$, by using the fact that
the derivative of $\theta(\log\theta)^2 - 2\theta\log\theta + 2\theta$ is equal to $(\log\theta)^2$.]

6. Consider the sequence $\{a_k\}_{k=-\infty}^{\infty}$ defined by

$$a_k = \begin{cases} 1/k & \text{if } k \ge 1 \\ 0 & \text{if } k \le 0. \end{cases}$$

Note that $\{a_k\} \in \ell^2(\mathbb{Z})$, but that no Riemann integrable function has $k$th Fourier
coefficient equal to $a_k$ for all $k$.


7. Show that the trigonometric series

$$\sum_{n\ge 2} \frac{1}{\log n}\sin nx$$

converges for every $x$, yet it is not the Fourier series of a Riemann integrable
function.

The same is true for $\sum \frac{\sin nx}{n^\alpha}$ for $0 < \alpha < 1$, but the case $1/2 < \alpha < 1$ is more
difficult. See Problem 1.

8. Exercise 6 in Chapter 2 dealt with the sums

$$\sum_{n \text{ odd} \ge 1} \frac{1}{n^2} \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{1}{n^2}.$$

Similar sums can be derived using the methods of this chapter.

(a) Let $f$ be the function defined on $[-\pi, \pi]$ by $f(\theta) = |\theta|$. Use Parseval's
identity to find the sums of the following two series:

$$\sum_{n=0}^{\infty} \frac{1}{(2n+1)^4} \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{1}{n^4}.$$

In fact, they are $\pi^4/96$ and $\pi^4/90$, respectively.

(b) Consider the $2\pi$-periodic odd function defined on $[0, \pi]$ by $f(\theta) = \theta(\pi - \theta)$.
Show that

$$\sum_{n=0}^{\infty} \frac{1}{(2n+1)^6} = \frac{\pi^6}{960} \quad \text{and} \quad \sum_{n=1}^{\infty} \frac{1}{n^6} = \frac{\pi^6}{945}.$$

Remark. The general expression when $k$ is even for $\sum_{n=1}^{\infty} 1/n^k$ in terms of $\pi^k$
is given in Problem 4. However, finding a formula for the sum $\sum_{n=1}^{\infty} 1/n^3$, or
more generally $\sum_{n=1}^{\infty} 1/n^k$ with $k$ odd, is a famous unresolved question.
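The four values quoted in Exercise 8 can be compared with partial sums of the series. The sketch below is only a numerical check, not a derivation; the number of terms is an arbitrary choice that is more than enough for double precision.

```python
import math

def partial(term, start, count=20000):
    """Sum term(n) for n = start, ..., start + count - 1."""
    return sum(term(n) for n in range(start, start + count))

odd_fourth = partial(lambda n: 1.0 / (2 * n + 1) ** 4, 0)
all_fourth = partial(lambda n: 1.0 / n ** 4, 1)
odd_sixth = partial(lambda n: 1.0 / (2 * n + 1) ** 6, 0)
all_sixth = partial(lambda n: 1.0 / n ** 6, 1)

print(odd_fourth, math.pi ** 4 / 96)
print(all_fourth, math.pi ** 4 / 90)
print(odd_sixth, math.pi ** 6 / 960)
print(all_sixth, math.pi ** 6 / 945)
```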
9. Show that for $\alpha$ not an integer, the Fourier series of

$$\frac{\pi}{\sin\pi\alpha}\,e^{i(\pi - x)\alpha}$$

on $[0, 2\pi]$ is given by

$$\sum_{n=-\infty}^{\infty} \frac{e^{inx}}{n + \alpha}.$$

Apply Parseval's formula to show that

$$\sum_{n=-\infty}^{\infty} \frac{1}{(n + \alpha)^2} = \frac{\pi^2}{(\sin\pi\alpha)^2}.$$

10. Consider the example of a vibrating string which we analyzed in Chapter 1.
The displacement $u(x, t)$ of the string at time $t$ satisfies the wave equation

$$\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \qquad c^2 = \tau/\rho.$$

The string is subject to the initial conditions

$$u(x, 0) = f(x) \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = g(x),$$

where we assume that $f \in C^1$ and $g$ is continuous. We define the total energy
of the string by

$$E(t) = \frac{1}{2}\rho\int_0^L \left(\frac{\partial u}{\partial t}\right)^2 dx + \frac{1}{2}\tau\int_0^L \left(\frac{\partial u}{\partial x}\right)^2 dx.$$

The first term corresponds to the "kinetic energy" of the string (in analogy with
$(1/2)mv^2$, the kinetic energy of a particle of mass $m$ and velocity $v$), and the
second term corresponds to its "potential energy."

Show that the total energy of the string is conserved, in the sense that $E(t)$
is constant. Therefore,

$$E(t) = E(0) = \frac{1}{2}\rho\int_0^L g(x)^2\,dx + \frac{1}{2}\tau\int_0^L f'(x)^2\,dx.$$


11. The inequalities of Wirtinger and Poincaré establish a relationship between
the norm of a function and that of its derivative.

(a) If $f$ is $T$-periodic, continuous, and piecewise $C^1$ with $\int_0^T f(t)\,dt = 0$, show
that

$$\int_0^T |f(t)|^2\,dt \le \frac{T^2}{4\pi^2}\int_0^T |f'(t)|^2\,dt,$$

with equality if and only if $f(t) = A\sin(2\pi t/T) + B\cos(2\pi t/T)$.
[Hint: Apply Parseval's identity.]

(b) If $f$ is as above and $g$ is just $C^1$ and $T$-periodic, prove that

$$\left|\int_0^T \overline{f(t)}\,g(t)\,dt\right|^2 \le \frac{T^2}{4\pi^2}\int_0^T |f(t)|^2\,dt \int_0^T |g'(t)|^2\,dt.$$

(c) For any compact interval $[a, b]$ and any continuously differentiable function
$f$ with $f(a) = f(b) = 0$, show that

$$\int_a^b |f(t)|^2\,dt \le \frac{(b - a)^2}{\pi^2}\int_a^b |f'(t)|^2\,dt.$$

Discuss the case of equality, and prove that the constant $(b - a)^2/\pi^2$ cannot
be improved. [Hint: Extend $f$ to be odd with respect to $a$ and periodic
of period $T = 2(b - a)$ so that its integral over an interval of length $T$ is
0. Apply part a) to get the inequality, and conclude that equality holds if
and only if $f(t) = A\sin\bigl(\pi\frac{t - a}{b - a}\bigr)$.]


12. Prove that $\int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}$.

[Hint: Start with the fact that the integral of $D_N(\theta)$ equals $2\pi$, and note that
the difference $(1/\sin(\theta/2)) - 2/\theta$ is continuous on $[-\pi, \pi]$. Apply the
Riemann-Lebesgue lemma.]

13. Suppose that $f$ is periodic and of class $C^k$. Show that

$$\hat f(n) = o(1/|n|^k),$$

that is, $|n|^k \hat f(n)$ goes to 0 as $|n| \to \infty$. This is an improvement over Exercise 10
in Chapter 2.

[Hint: Use the Riemann-Lebesgue lemma.]

14. Prove that the Fourier series of a continuously differentiable function $f$ on
the circle is absolutely convergent.

[Hint: Use the Cauchy-Schwarz inequality and Parseval's identity for $f'$.]


15. Let $f$ be $2\pi$-periodic and Riemann integrable on $[-\pi, \pi]$.

(a) Show that

$$\hat f(n) = -\frac{1}{2\pi}\int_{-\pi}^{\pi} f(x + \pi/n)\,e^{-inx}\,dx$$

hence

$$\hat f(n) = \frac{1}{4\pi}\int_{-\pi}^{\pi} [f(x) - f(x + \pi/n)]\,e^{-inx}\,dx.$$

(b) Now assume that $f$ satisfies a Hölder condition of order $\alpha$, namely

$$|f(x + h) - f(x)| \le C|h|^\alpha$$

for some $0 < \alpha \le 1$, some $C > 0$, and all $x, h$. Use part a) to show that

$$\hat f(n) = O(1/|n|^\alpha).$$

(c) Prove that the above result cannot be improved by showing that the
function

$$f(x) = \sum_{k=0}^{\infty} 2^{-k\alpha} e^{i2^k x},$$

where $0 < \alpha < 1$, satisfies

$$|f(x + h) - f(x)| \le C|h|^\alpha,$$

and $\hat f(N) = 1/N^\alpha$ whenever $N = 2^k$.

[Hint: For (c), break up the sum as follows $f(x + h) - f(x) = \sum_{2^k \le 1/|h|} + \sum_{2^k > 1/|h|}$.
To estimate the first sum use the fact that $|1 - e^{i\theta}| \le |\theta|$ whenever $\theta$
is small. To estimate the second sum, use the obvious inequality $|e^{ix} - e^{iy}| \le 2$.]

16. Let $f$ be a $2\pi$-periodic function which satisfies a Lipschitz condition with
constant $K$; that is,

$$|f(x) - f(y)| \le K|x - y| \quad \text{for all } x, y.$$

This is simply the Hölder condition with $\alpha = 1$, so by the previous exercise, we
see that $\hat f(n) = O(1/|n|)$. Since the harmonic series $\sum 1/n$ diverges, we cannot
say anything (yet) about the absolute convergence of the Fourier series of $f$. The
outline below actually proves that the Fourier series of $f$ converges absolutely
and uniformly.

(a) For every positive $h$ we define $g_h(x) = f(x + h) - f(x - h)$. Prove that

$$\frac{1}{2\pi}\int_0^{2\pi} |g_h(x)|^2\,dx = \sum_{n=-\infty}^{\infty} 4|\sin nh|^2 |\hat f(n)|^2,$$

and show that

$$\sum_{n=-\infty}^{\infty} |\sin nh|^2 |\hat f(n)|^2 \le K^2 h^2.$$

(b) Let $p$ be a positive integer. By choosing $h = \pi/2^{p+1}$, show that

$$\sum_{2^{p-1} < |n| \le 2^p} |\hat f(n)|^2 \le \frac{K^2\pi^2}{2^{2p+1}}.$$

(c) Estimate $\sum_{2^{p-1} < |n| \le 2^p} |\hat f(n)|$, and conclude that the Fourier series of $f$
converges absolutely, hence uniformly. [Hint: Use the Cauchy-Schwarz
inequality to estimate the sum.]

(d) In fact, modify the argument slightly to prove Bernstein's theorem: If $f$
satisfies a Hölder condition of order $\alpha > 1/2$, then the Fourier series of $f$
converges absolutely.


17. If $f$ is a bounded monotonic function on $[-\pi, \pi]$, then

$$\hat f(n) = O(1/|n|).$$

[Hint: One may assume that $f$ is increasing, and say $|f| \le M$. First check that
the Fourier coefficients of the characteristic function of $[a, b]$ satisfy $O(1/|n|)$.
Now show that a sum of the form

$$\sum_{k=1}^{N} \alpha_k \chi_{[a_k, a_{k+1}]}(x)$$

with $-\pi = a_1 < a_2 < \cdots < a_N < a_{N+1} = \pi$ and $-M \le \alpha_1 \le \cdots \le \alpha_N \le M$ has
Fourier coefficients that are $O(1/|n|)$ uniformly in $N$. Summing by parts one gets
a telescopic sum $\sum (\alpha_{k+1} - \alpha_k)$ which can be bounded by $2M$. Now approximate
$f$ by functions of the above type.]

18. Here are a few things we have learned about the decay of Fourier coefficients:

(a) if $f$ is of class $C^k$, then $\hat f(n) = o(1/|n|^k)$;

(b) if $f$ is Lipschitz, then $\hat f(n) = O(1/|n|)$;

(c) if $f$ is monotonic, then $\hat f(n) = O(1/|n|)$;

(d) if $f$ satisfies a Hölder condition with exponent $\alpha$ where $0 < \alpha < 1$, then
$\hat f(n) = O(1/|n|^\alpha)$;

(e) if $f$ is merely Riemann integrable, then $\sum |\hat f(n)|^2 < \infty$ and therefore
$\hat f(n) = o(1)$.

Nevertheless, show that the Fourier coefficients of a continuous function can
tend to 0 arbitrarily slowly by proving that for every sequence of nonnegative
real numbers $\{\epsilon_n\}$ converging to 0, there exists a continuous function $f$ such
that $|\hat f(n)| \ge \epsilon_n$ for infinitely many values of $n$.

[Hint: Choose a subsequence $\{\epsilon_{n_k}\}$ so that $\sum_k \epsilon_{n_k} < \infty$.]


19. Give another proof that the sum $\sum_{0<|n|\le N} e^{inx}/n$ is uniformly bounded in
$N$ and $x \in [-\pi, \pi]$ by using the fact that

$$\frac{1}{2i}\sum_{0<|n|\le N} \frac{e^{inx}}{n} = \sum_{n=1}^{N} \frac{\sin nx}{n} = \frac{1}{2}\int_0^x (D_N(t) - 1)\,dt,$$

where $D_N$ is the Dirichlet kernel. Now use the fact that $\int_0^{\infty} \frac{\sin t}{t}\,dt < \infty$ which
was proved in Exercise 12.

20. Let $f(x)$ denote the sawtooth function defined by $f(x) = (\pi - x)/2$ on the
interval $(0, 2\pi)$ with $f(0) = 0$ and extended by periodicity to all of $\mathbb{R}$. The
Fourier series of $f$ is

$$f(x) \sim \frac{1}{2i}\sum_{|n|\ne 0} \frac{e^{inx}}{n} = \sum_{n=1}^{\infty} \frac{\sin nx}{n},$$

and $f$ has a jump discontinuity at the origin with

$$f(0^+) = \frac{\pi}{2}, \quad f(0^-) = -\frac{\pi}{2}, \quad \text{and hence} \quad f(0^+) - f(0^-) = \pi.$$

Show that

$$\max_{0 < x \le \pi/N} S_N(f)(x) - \frac{\pi}{2} = \int_0^{\pi} \frac{\sin t}{t}\,dt - \frac{\pi}{2},$$

which is roughly 9% of the jump $\pi$. This result is a manifestation of Gibbs's
phenomenon which states that near a jump discontinuity, the Fourier series of a
function overshoots (or undershoots) it by approximately 9% of the jump.

[Hint: Use the expression for $S_N(f)$ given in Exercise 19.]
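The size of the overshoot in Exercise 20 can be estimated numerically. The sketch below evaluates the partial sums $\sum_{n=1}^{N} \sin(nx)/n$ on a grid in $(0, \pi/N]$ and compares their maximum with $\int_0^\pi \frac{\sin t}{t}\,dt$, computed from the power series of $\sin t / t$ integrated term by term. The value of $N$, the grid, and the helper names are assumptions of this illustration; for finite $N$ the peak is expected to sit slightly below the integral and to approach it as $N$ grows.

```python
import math

def S_N(N, x):
    """Partial sum of the sawtooth series: sum_{n=1}^{N} sin(n x)/n."""
    return sum(math.sin(n * x) / n for n in range(1, N + 1))

def sine_integral_to_pi(terms=30):
    """Integral over [0, pi] of sin(t)/t dt, from the power series integrated term by term."""
    return sum((-1) ** k * math.pi ** (2 * k + 1) / ((2 * k + 1) * math.factorial(2 * k + 1))
               for k in range(terms))

N = 400
grid = [math.pi * j / (50 * N) for j in range(1, 51)]  # sample points in (0, pi/N]
peak = max(S_N(N, x) for x in grid)
limit = sine_integral_to_pi()
overshoot_fraction = (limit - math.pi / 2) / math.pi
print(peak, limit, overshoot_fraction)
```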
4 Problems


1. For each $0 < \alpha < 1$ the series

$$\sum_{n=1}^{\infty} \frac{\sin nx}{n^\alpha}$$

converges for every $x$ but is not the Fourier series of a Riemann integrable
function.

(a) If the conjugate Dirichlet kernel is defined by

$$\tilde D_N(x) = \sum_{|n|\le N} \operatorname{sign}(n)\,e^{inx} \quad \text{where} \quad \operatorname{sign}(n) = \begin{cases} 1 & \text{if } n > 0 \\ 0 & \text{if } n = 0 \\ -1 & \text{if } n < 0, \end{cases}$$

then show that

$$\tilde D_N(x) = i\,\frac{\cos(x/2) - \cos((N + 1/2)x)}{\sin(x/2)},$$

and

$$\int_{-\pi}^{\pi} |\tilde D_N(x)|\,dx \le c\log N.$$

(b) As a result, if $f$ is Riemann integrable, then

$$(f * \tilde D_N)(0) = O(\log N).$$

(c) In the present case, this leads to

$$\sum_{n=1}^{N} \frac{1}{n^\alpha} = O(\log N),$$

which is a contradiction.


2. An important fact we have proved is that the family $\{e^{inx}\}_{n\in\mathbb{Z}}$ is orthonormal
in $\mathcal{R}$ and it is also complete, in the sense that the Fourier series of $f$ converges
to $f$ in the norm. In this exercise, we consider another family possessing these
same properties.

On $[-1, 1]$ define

$$L_n(x) = \frac{d^n}{dx^n}(x^2 - 1)^n, \quad n = 0, 1, 2, \ldots.$$

Then $L_n$ is a polynomial of degree $n$ which is called the $n$th Legendre
polynomial.

(a) Show that if $f$ is indefinitely differentiable on $[-1, 1]$, then

$$\int_{-1}^{1} L_n(x) f(x)\,dx = (-1)^n \int_{-1}^{1} (x^2 - 1)^n f^{(n)}(x)\,dx.$$

In particular, show that $L_n$ is orthogonal to $x^m$ whenever $m < n$. Hence
$\{L_n\}_{n=0}^{\infty}$ is an orthogonal family.

(b) Show that

$$\|L_n\|^2 = \int_{-1}^{1} |L_n(x)|^2\,dx = \frac{(n!)^2\,2^{2n+1}}{2n + 1}.$$

[Hint: First, note that $\|L_n\|^2 = (-1)^n (2n)! \int_{-1}^{1} (x^2 - 1)^n\,dx$. Write
$(x^2 - 1)^n = (x - 1)^n (x + 1)^n$ and integrate by parts $n$ times to calculate
this last integral.]

(c) Prove that any polynomial of degree $n$ that is orthogonal to $1, x, x^2, \ldots, x^{n-1}$
is a constant multiple of $L_n$.

(d) Let $\mathcal{L}_n = L_n/\|L_n\|$, which are the normalized Legendre polynomials. Prove
that $\{\mathcal{L}_n\}$ is the family obtained by applying the "Gram-Schmidt process"
to $\{1, x, \ldots, x^n, \ldots\}$, and conclude that every Riemann integrable function
$f$ on $[-1, 1]$ has a Legendre expansion

$$\sum_{n=0}^{\infty} \langle f, \mathcal{L}_n\rangle\,\mathcal{L}_n$$

which converges to $f$ in the mean-square sense.
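The orthogonality of the $L_n$ and the value of $\|L_n\|^2$ can be checked with exact rational arithmetic for small $n$. The sketch below stores a polynomial as a list of coefficients indexed by the power of $x$; this representation and the helper names are choices made for the illustration, and the check covers only finitely many $n$.

```python
from fractions import Fraction
from math import factorial

def multiply(p, q):
    """Product of two polynomials given as coefficient lists (index = power of x)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative(p):
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]

def integral_on_symmetric_interval(p):
    """Exact integral of p over [-1, 1]; odd powers integrate to zero."""
    return sum(Fraction(2, k + 1) * c for k, c in enumerate(p) if k % 2 == 0)

def legendre(n):
    """L_n = d^n/dx^n (x^2 - 1)^n, the unnormalized definition used in this problem."""
    p = [Fraction(1)]
    for _ in range(n):
        p = multiply(p, [Fraction(-1), Fraction(0), Fraction(1)])
    for _ in range(n):
        p = derivative(p)
    return p

def inner(p, q):
    return integral_on_symmetric_interval(multiply(p, q))

for n in range(4):
    print(n, legendre(n), inner(legendre(n), legendre(n)))
```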

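3. Let $\alpha$ be a complex number not equal to an integer.

(a) Calculate the Fourier series of the $2\pi$-periodic function defined on $[-\pi, \pi]$
by $f(x) = \cos(\alpha x)$.

(b) Prove the following formulas due to Euler:

$$\sum_{n=1}^{\infty} \frac{1}{n^2 - \alpha^2} = \frac{1}{2\alpha^2} - \frac{\pi}{2\alpha\tan(\alpha\pi)}.$$

For all $u \in \mathbb{C} - \pi\mathbb{Z}$,

$$\cot u = \frac{1}{u} + 2\sum_{n=1}^{\infty} \frac{u}{u^2 - n^2\pi^2}.$$

(c) Show that for all $\alpha \in \mathbb{C} - \mathbb{Z}$ we have

$$\frac{\alpha\pi}{\sin(\alpha\pi)} = 1 + 2\alpha^2 \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^2 - \alpha^2}.$$

(d) For all $0 < \alpha < 1$, show that

$$\int_0^{\infty} \frac{t^{\alpha-1}}{t + 1}\,dt = \frac{\pi}{\sin(\alpha\pi)}.$$

[Hint: Split the integral as $\int_0^1 + \int_1^{\infty}$ and change variables $t = 1/u$ in the
second integral. Now both integrals are of the form

$$\int_0^1 \frac{t^{\gamma-1}}{1 + t}\,dt, \quad 0 < \gamma < 1,$$

which one can show is equal to $\sum_{k=0}^{\infty} \frac{(-1)^k}{k + \gamma}$. Use part (c) to conclude the
proof.]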


4. In this problem, we find the formula for the sum of the series

$$\sum_{n=1}^{\infty} \frac{1}{n^k}$$

where $k$ is any even integer. These sums are expressed in terms of the Bernoulli
numbers; the related Bernoulli polynomials are discussed in the next problem.

Define the Bernoulli numbers $B_n$ by the formula

$$\frac{z}{e^z - 1} = \sum_{n=0}^{\infty} \frac{B_n}{n!} z^n.$$

(a) Show that $B_0 = 1$, $B_1 = -1/2$, $B_2 = 1/6$, $B_3 = 0$, $B_4 = -1/30$, and
$B_5 = 0$.

(b) Show that for $n \ge 1$ we have

$$B_n = -\frac{1}{n + 1}\sum_{k=0}^{n-1} \binom{n+1}{k} B_k.$$

(c) By writing

$$\frac{z}{e^z - 1} = 1 - \frac{z}{2} + \sum_{n=2}^{\infty} \frac{B_n}{n!} z^n,$$

show that $B_n = 0$ if $n$ is odd and $> 1$. Also prove that

$$z\cot z = 1 + \sum_{n=1}^{\infty} (-1)^n \frac{2^{2n} B_{2n}}{(2n)!} z^{2n}.$$

(d) The zeta function is defined by

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \quad \text{for all } s > 1.$$

Deduce from the result in (c), and the expression for the cotangent function
obtained in the previous problem, that

$$x\cot x = 1 - 2\sum_{m=1}^{\infty} \frac{\zeta(2m)}{\pi^{2m}} x^{2m}.$$

(e) Conclude that

$$2\zeta(2m) = (-1)^{m+1} \frac{(2\pi)^{2m}}{(2m)!} B_{2m}.$$
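The recurrence in part (b) gives a direct way to tabulate the Bernoulli numbers, and the formula in part (e) can then be compared with known values of $\zeta(2)$ and $\zeta(4)$. The sketch below uses exact fractions for the $B_n$; the function names are hypothetical, and the comparison is a check of the stated formulas rather than a proof.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(count):
    """B_0, ..., B_{count-1} from B_n = -(1/(n+1)) * sum_{k<n} C(n+1, k) B_k."""
    B = [Fraction(1)]
    for n in range(1, count):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

def zeta_even(m, B):
    """zeta(2m) from 2 zeta(2m) = (-1)^{m+1} (2 pi)^{2m} B_{2m} / (2m)!."""
    return (-1) ** (m + 1) * (2 * pi) ** (2 * m) * float(B[2 * m]) / (2 * factorial(2 * m))

B = bernoulli_numbers(9)
print(B[:6])
print(zeta_even(1, B), pi ** 2 / 6)
print(zeta_even(2, B), pi ** 4 / 90)
```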


5. Define the Bernoulli polynomials $B_n(x)$ by the formula

$$\frac{ze^{xz}}{e^z - 1} = \sum_{n=0}^{\infty} \frac{B_n(x)}{n!} z^n.$$

(a) The functions $B_n(x)$ are polynomials in $x$ and

$$B_n(x) = \sum_{k=0}^{n} \binom{n}{k} B_k x^{n-k}.$$

Show that $B_0(x) = 1$, $B_1(x) = x - 1/2$, $B_2(x) = x^2 - x + 1/6$, and
$B_3(x) = x^3 - \frac{3}{2}x^2 + \frac{1}{2}x$.

(b) If $n \ge 1$, then

$$B_n(x + 1) - B_n(x) = nx^{n-1},$$

and if $n \ge 2$, then

$$B_n(0) = B_n(1) = B_n.$$

(c) Define $S_m(n) = 1^m + 2^m + \cdots + (n - 1)^m$. Show that

$$(m + 1)S_m(n) = B_{m+1}(n) - B_{m+1}.$$

(d) Prove that the Bernoulli polynomials are the only polynomials that satisfy

(i) $B_0(x) = 1$,

(ii) $B_n'(x) = nB_{n-1}(x)$ for $n \ge 1$,

(iii) $\int_0^1 B_n(x)\,dx = 0$ for $n \ge 1$,

and show that from (b) one obtains

$$\int_x^{x+1} B_n(t)\,dt = x^n.$$

(e) Calculate the Fourier series of $B_1(x)$ to conclude that for $0 < x < 1$ we
have

$$B_1(x) = x - \frac{1}{2} = \frac{-1}{\pi}\sum_{k=1}^{\infty} \frac{\sin(2\pi kx)}{k}.$$

Integrate and conclude that

$$B_{2n}(x) = (-1)^{n+1} \frac{2(2n)!}{(2\pi)^{2n}} \sum_{k=1}^{\infty} \frac{\cos(2\pi kx)}{k^{2n}},$$

$$B_{2n+1}(x) = (-1)^{n+1} \frac{2(2n + 1)!}{(2\pi)^{2n+1}} \sum_{k=1}^{\infty} \frac{\sin(2\pi kx)}{k^{2n+1}}.$$

Finally, show that for $0 < x < 1$,

$$B_n(x) = -\frac{n!}{(2\pi i)^n} \sum_{k\ne 0} \frac{e^{2\pi ikx}}{k^n}.$$

We observe that the Bernoulli polynomials are, up to normalization, successive
integrals of the sawtooth function.
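Parts (b) and (c) of Problem 5 lend themselves to an exact check for small indices. The sketch below evaluates $B_n(x)$ from the formula in part (a), with the Bernoulli numbers generated by the recurrence of Problem 4(b); the function names and the test point $x = 2/3$ are assumptions of this illustration.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """B_0, ..., B_{count-1} from the recurrence in Problem 4(b)."""
    B = [Fraction(1)]
    for n in range(1, count):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B

B = bernoulli_numbers(10)

def bernoulli_polynomial(n, x):
    """B_n(x) = sum_{k=0}^{n} C(n, k) B_k x^{n-k}."""
    return sum(comb(n, k) * B[k] * x ** (n - k) for k in range(n + 1))

def power_sum(m, n):
    """S_m(n) = 1^m + 2^m + ... + (n-1)^m."""
    return sum(j ** m for j in range(1, n))

x = Fraction(2, 3)
for n in range(1, 5):
    print(n, bernoulli_polynomial(n, x + 1) - bernoulli_polynomial(n, x), n * x ** (n - 1))
print(3 * power_sum(2, 10), bernoulli_polynomial(3, 10) - B[3])
```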
Some Applications of Fourier Series


Fourier series and analogous expansions intervene very 
naturally in the general theory of curves and surfaces. 
In effect, this theory, conceived from the point of view 
of analysis, deals obviously with the study of arbitrary 
functions. I was thus led to use Fourier series in several
questions of geometry, and I have obtained in this
direction a number of results which will be presented 
in this work. One notes that my considerations form 
only a beginning of a principal series of researches, 
which would without doubt give many new results. 

A. Hurwitz, 1902 


In the previous chapters we introduced some basic facts about Fourier 
analysis, motivated by problems that arose in physics. The motion of a 
string and the diffusion of heat were two instances that led naturally to 
the expansion of a function in terms of a Fourier series. We propose next 
to give the reader a flavor of the broader impact of Fourier analysis, and 
illustrate how these ideas reach out to other areas of mathematics. In 
particular, consider the following three problems: 

I. Among all simple closed curves of length $\ell$ in the plane $\mathbb{R}^2$, which
one encloses the largest area?

II. Given an irrational number $\gamma$, what can be said about the
distribution of the fractional parts of the sequence of numbers $n\gamma$, for
$n = 1, 2, 3, \ldots$?

III. Does there exist a continuous function that is nowhere
differentiable?


The first problem is clearly geometric in nature, and at first sight, would
seem to have little to do with Fourier series. The second question lies on
the border between number theory and the study of dynamical systems,
and gives us the simplest example of the idea of "ergodicity." The third
problem, while analytic in nature, resisted many attempts before the
solution was finally discovered. It is remarkable that all three questions
can be resolved quite simply and directly by the use of Fourier series.

In the last section of this chapter, we return to a problem that provided 
our initial motivation. We consider the time-dependent heat equation 
on the circle. Here our investigation will lead us to the important but 
enigmatic heat kernel for the circle. However, the mysteries surrounding 
its basic properties will not be fully understood until we can apply the 
Poisson summation formula, which we will do in the next chapter. 

1 The isoperimetric inequality 

Let $\Gamma$ denote a closed curve in the plane which does not intersect itself.
Also, let $\ell$ denote the length of $\Gamma$, and $\mathcal{A}$ the area of the bounded region
in $\mathbb{R}^2$ enclosed by $\Gamma$. The problem now is to determine for a given $\ell$ the
curve $\Gamma$ which maximizes $\mathcal{A}$ (if any such curve exists).



Figure 1. The isoperimetric problem 


A little experimentation and reflection suggests that the solution should
be a circle. This conclusion can be reached by the following heuristic
considerations. The curve can be thought of as a closed piece of string lying
flat on a table. If the region enclosed by the string is not convex (for
example), one can deform part of the string and increase the area enclosed
by it. Also, playing with some simple examples, one can convince oneself
that the “flatter” the curve is in some portion, the less efficient it is in
enclosing area. Therefore we want to maximize the “roundness” of the
curve at each point.

Although the circle is the correct guess, making the above ideas precise 
is a difficult matter. 

The key idea in the solution we give to the isoperimetric problem
consists of an application of Parseval’s identity for Fourier series. However,
before we can attempt a solution to this problem, we must define the
notion of a simple closed curve, its length, and what we mean by the
area of the region enclosed by it.

Curves, length and area 

A parametrized curve $\gamma$ is a mapping

$$\gamma : [a, b] \to \mathbb{R}^2.$$

The image of $\gamma$ is a set of points in the plane which we call a curve and
denote by $\Gamma$. The curve $\Gamma$ is simple if it does not intersect itself, and
closed if its two end-points coincide. In terms of the parametrization
above, these two conditions translate into $\gamma(s_1) \neq \gamma(s_2)$ unless $s_1 = a$
and $s_2 = b$, in which case $\gamma(a) = \gamma(b)$. We may extend $\gamma$ to a periodic
function on $\mathbb{R}$ of period $b - a$, and think of $\gamma$ as a function on the circle.
We also always impose some smoothness on our curves by assuming that
$\gamma$ is of class $C^1$, and that its derivative $\gamma'$ satisfies $\gamma'(s) \neq 0$. Altogether,
these conditions guarantee that $\Gamma$ has a well-defined tangent at each
point, which varies continuously as the point on the curve varies. Moreover,
the parametrization $\gamma$ induces an orientation on $\Gamma$ as the parameter
$s$ travels from $a$ to $b$.

Any $C^1$ bijective mapping $s : [c, d] \to [a, b]$ gives rise to another
parametrization of $\Gamma$ by the formula

$$\eta(t) = \gamma(s(t)).$$

Clearly, the conditions that $\Gamma$ be closed and simple are independent of
the chosen parametrization. Also, we say that the two parametrizations
$\gamma$ and $\eta$ are equivalent if $s'(t) > 0$ for all $t$; this means that $\eta$ and $\gamma$
induce the same orientation on the curve $\Gamma$. If, however, $s'(t) < 0$, then
$\eta$ reverses the orientation.

If $\Gamma$ is parametrized by $\gamma(s) = (x(s), y(s))$, then the length of the
curve $\Gamma$ is defined by

$$\ell = \int_a^b |\gamma'(s)|\, ds = \int_a^b \left( x'(s)^2 + y'(s)^2 \right)^{1/2} ds.$$

The length of $\Gamma$ is a notion intrinsic to the curve, and does not depend
on its parametrization. To see that this is indeed the case, suppose that
$\gamma(s(t)) = \eta(t)$. Then, the change of variables formula and the chain rule
imply that

$$\int_a^b |\gamma'(s)|\, ds = \int_c^d |\gamma'(s(t))|\, |s'(t)|\, dt = \int_c^d |\eta'(t)|\, dt,$$










as desired. 

In the proof of the theorem below, we shall use a special type of
parametrization for $\Gamma$. We say that $\gamma$ is a parametrization by arc-length
if $|\gamma'(s)| = 1$ for all $s$. This means that $\gamma(s)$ travels at a constant
speed, and as a consequence, the length of $\Gamma$ is precisely $b - a$. Therefore,
after a possible additional translation, a parametrization by arc-length
will be defined on $[0, \ell]$. Any curve admits a parametrization by arc-length
(Exercise 1).

We now turn to the isoperimetric problem. 

The attempt to give a precise formulation of the area $\mathcal{A}$ of the region
enclosed by a simple closed curve $\Gamma$ raises a number of tricky questions.
In a variety of simple situations, it is evident that the area is given by
the following familiar formula of the calculus:

$$\mathcal{A} = \frac{1}{2} \left| \int_\Gamma (x\, dy - y\, dx) \right| = \frac{1}{2} \left| \int_a^b x(s) y'(s) - y(s) x'(s)\, ds \right|; \tag{1}$$

see, for example, Exercise 3. Thus in formulating our result we shall
adopt the easy expedient of taking (1) as our definition of area. This
device allows us to give a quick and neat proof of the isoperimetric
inequality. A listing of issues this simplification leaves unresolved can be
found after the proof of the theorem.

Statement and proof of the isoperimetric inequality 

Theorem 1.1 Suppose that $\Gamma$ is a simple closed curve in $\mathbb{R}^2$ of length
$\ell$, and let $\mathcal{A}$ denote the area of the region enclosed by this curve. Then

$$\mathcal{A} \le \frac{\ell^2}{4\pi},$$

with equality if and only if $\Gamma$ is a circle.

The first observation is that we can rescale the problem. This means
that we can change the units of measurement by a factor of $\delta > 0$ as
follows. Consider the mapping of the plane $\mathbb{R}^2$ to itself, which sends the
point $(x, y)$ to $(\delta x, \delta y)$. A look at the formula defining the length of a
curve shows that if $\Gamma$ is of length $\ell$, then its image under this mapping
has length $\delta \ell$. So this operation magnifies or contracts lengths by a
factor of $\delta$ depending on whether $\delta \ge 1$ or $\delta \le 1$. Similarly, we see that
the mapping magnifies (or contracts) areas by a factor of $\delta^2$. By taking
$\delta = 2\pi/\ell$, we see that it suffices to prove that if $\ell = 2\pi$ then $\mathcal{A} \le \pi$, with
equality only if $\Gamma$ is a circle.

Let $\gamma : [0, 2\pi] \to \mathbb{R}^2$ with $\gamma(s) = (x(s), y(s))$ be a parametrization by
arc-length of the curve $\Gamma$, that is, $x'(s)^2 + y'(s)^2 = 1$ for all $s \in [0, 2\pi]$.
This implies that

$$\frac{1}{2\pi} \int_0^{2\pi} \left( x'(s)^2 + y'(s)^2 \right) ds = 1. \tag{2}$$

Since the curve is closed, the functions $x(s)$ and $y(s)$ are $2\pi$-periodic, so
we may consider their Fourier series

$$x(s) \sim \sum a_n e^{ins} \quad \text{and} \quad y(s) \sim \sum b_n e^{ins}.$$

Then, as we remarked in the later part of Section 2 of Chapter 2, we
have

$$x'(s) \sim \sum a_n i n e^{ins} \quad \text{and} \quad y'(s) \sim \sum b_n i n e^{ins}.$$

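Parseval’s identity applied to (2) gives

$$\sum_{n=-\infty}^{\infty} |n|^2 \left( |a_n|^2 + |b_n|^2 \right) = 1. \tag{3}$$

We now apply the bilinear form of Parseval’s identity (Lemma 1.5, Chapter 3)
to the integral defining $\mathcal{A}$. Since $x(s)$ and $y(s)$ are real-valued, we
have $a_n = \overline{a_{-n}}$ and $b_n = \overline{b_{-n}}$, so we find that

$$\mathcal{A} = \frac{1}{2} \left| \int_0^{2\pi} x(s) y'(s) - y(s) x'(s)\, ds \right| = \pi \left| \sum_{n=-\infty}^{\infty} n \left( a_n \overline{b_n} - b_n \overline{a_n} \right) \right|.$$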

We observe next that

$$|a_n \overline{b_n} - b_n \overline{a_n}| \le 2\, |a_n|\, |b_n| \le |a_n|^2 + |b_n|^2, \tag{4}$$

and since $|n| \le |n|^2$, we may use (3) to get

$$\mathcal{A} \le \pi \sum_{n=-\infty}^{\infty} |n|^2 \left( |a_n|^2 + |b_n|^2 \right) \le \pi,$$

as desired.

When $\mathcal{A} = \pi$, we see from the above argument that

$$x(s) = a_{-1} e^{-is} + a_0 + a_1 e^{is} \quad \text{and} \quad y(s) = b_{-1} e^{-is} + b_0 + b_1 e^{is}$$

because $|n| < |n|^2$ as soon as $|n| \ge 2$. We know that $x(s)$ and $y(s)$ are
real-valued, so $a_{-1} = \overline{a_1}$ and $b_{-1} = \overline{b_1}$. The identity (3) implies that
$2 \left( |a_1|^2 + |b_1|^2 \right) = 1$, and since we have equality in (4) we must have
$|a_1| = |b_1| = 1/2$. We write

$$a_1 = \frac{1}{2} e^{i\alpha} \quad \text{and} \quad b_1 = \frac{1}{2} e^{i\beta}.$$

The fact that $1 = 2\, |a_1 \overline{b_1} - \overline{a_1} b_1|$ implies that $|\sin(\alpha - \beta)| = 1$, hence
$\alpha - \beta = k\pi/2$ where $k$ is an odd integer. From this we find that

$$x(s) = a_0 + \cos(\alpha + s) \quad \text{and} \quad y(s) = b_0 \pm \sin(\alpha + s),$$

where the sign in $y(s)$ depends on the parity of $(k - 1)/2$. In any case,
we see that $\Gamma$ is a circle, for which the case of equality obviously holds,
and the proof of the theorem is complete.
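
As a numerical illustration (not part of the proof), the inequality can be checked on polygonal approximations of closed curves. The sketch below assumes the curve is sampled at finitely many points; the length is the sum of the chord lengths and the area is the discrete analogue of formula (1). The function names and the sample curves (a unit circle and an ellipse with semi-axes 2 and 1) are illustrative choices, not taken from the text.

```python
import math


def polygon_length_area(points):
    """Return (length, area) of the closed polygon through the given (x, y) points."""
    length = 0.0
    twice_signed_area = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        length += math.hypot(x1 - x0, y1 - y0)
        # discrete analogue of the integrand x dy - y dx in formula (1)
        twice_signed_area += x0 * y1 - y0 * x1
    return length, abs(twice_signed_area) / 2.0


def sampled_ellipse(a, b, n):
    """n sample points on the ellipse with semi-axes a and b (a circle when a == b)."""
    return [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
            for k in range(n)]


def isoperimetric_deficit(points):
    """The quantity length^2 - 4*pi*area, which the theorem says is nonnegative."""
    length, area = polygon_length_area(points)
    return length ** 2 - 4 * math.pi * area


circle_deficit = isoperimetric_deficit(sampled_ellipse(1.0, 1.0, 2000))
ellipse_deficit = isoperimetric_deficit(sampled_ellipse(2.0, 1.0, 2000))
print(circle_deficit, ellipse_deficit)
```

For the sampled circle the deficit is small and positive (the polygon is not quite a circle), while for the ellipse it is far from zero, in line with the equality case of the theorem.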


The solution given above (due to Hurwitz in 1901) is indeed very elegant,
but clearly leaves some important issues unanswered. We list these
as follows. Suppose $\Gamma$ is a simple closed curve.

(i) How is the “region enclosed by $\Gamma$” defined?

(ii) What is the geometric definition of the “area” of this region? Does
this definition accord with (1)?

(iii) Can these results be extended to the most general class of simple
closed curves relevant to the problem, namely those curves which are
“rectifiable”, that is, those to which we can ascribe a finite length?

It turns out that the clarifications of the problems raised are connected
to a number of other significant ideas in analysis. We shall return to
these questions in succeeding books of this series.

2 Weyl’s equidistribution theorem 

We now apply ideas coming from Fourier series to a problem dealing
with properties of irrational numbers. We begin with a brief discussion
of congruences, a concept needed to understand our main theorem.

The reals modulo the integers

If $x$ is a real number, we let $[x]$ denote the greatest integer less than
or equal to $x$ and call the quantity $[x]$ the integer part of $x$. The
fractional part of $x$ is then defined by $\langle x \rangle = x - [x]$. In particular,
$\langle x \rangle \in [0, 1)$ for every $x \in \mathbb{R}$. For example, the integer and fractional parts
of $2.7$ are $2$ and $0.7$, respectively, while the integer and fractional parts
of $-3.4$ are $-4$ and $0.6$, respectively.

We may define a relation on $\mathbb{R}$ by saying that the two numbers $x$ and
$y$ are equivalent, or congruent, if $x - y \in \mathbb{Z}$. We then write

$$x = y \bmod \mathbb{Z} \quad \text{or} \quad x = y \bmod 1.$$

This means that we identify two real numbers if they differ by an integer.
Observe that any real number $x$ is congruent to a unique number in
$[0, 1)$ which is precisely $\langle x \rangle$, the fractional part of $x$. In effect, reducing
a real number modulo $\mathbb{Z}$ means looking only at its fractional part and
disregarding its integer part.
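
The definitions above translate directly into code. The following is a minimal sketch, assuming exact rational arithmetic through Python's `fractions.Fraction` so that the examples $2.7$ and $-3.4$ can be checked without rounding; the function names are illustrative and do not come from the text.

```python
import math
from fractions import Fraction


def integer_part(x):
    """Greatest integer less than or equal to x, written [x] in the text."""
    return math.floor(x)


def fractional_part(x):
    """The fractional part <x> = x - [x], which always lies in [0, 1)."""
    return x - math.floor(x)


def congruent_mod_1(x, y):
    """True when x - y is an integer (intended for exact inputs such as Fraction)."""
    return fractional_part(x - y) == 0


print(integer_part(Fraction(27, 10)), fractional_part(Fraction(27, 10)))
print(integer_part(Fraction(-34, 10)), fractional_part(Fraction(-34, 10)))
```

Note that for negative numbers the integer part rounds down, not toward zero, which is why $-3.4$ has integer part $-4$ and fractional part $0.6$.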

Now start with a real number $\gamma \neq 0$ and look at the sequence
$\gamma, 2\gamma, 3\gamma, \ldots$. An intriguing question is to ask what happens to this
sequence if we reduce it modulo $\mathbb{Z}$, that is, if we look at the sequence of
fractional parts

$$\langle \gamma \rangle, \langle 2\gamma \rangle, \langle 3\gamma \rangle, \ldots.$$

Here are some simple observations:

(i) If $\gamma$ is rational, then only finitely many numbers appearing in $\langle n\gamma \rangle$
are distinct.

(ii) If $\gamma$ is irrational, then the numbers $\langle n\gamma \rangle$ are all distinct.

Indeed, for part (i), note that if $\gamma = p/q$, the first $q$ terms in the sequence
are

$$\langle p/q \rangle, \langle 2p/q \rangle, \ldots, \langle (q-1)p/q \rangle, \langle qp/q \rangle = 0.$$

The sequence then begins to repeat itself, since

$$\langle (q+1)p/q \rangle = \langle p + p/q \rangle = \langle p/q \rangle,$$

and so on. However, see Exercise 6 for a more refined result.

Also, for part (ii) assume that not all numbers are distinct. We therefore
have $\langle n_1 \gamma \rangle = \langle n_2 \gamma \rangle$ for some $n_1 \neq n_2$; then $n_1 \gamma - n_2 \gamma \in \mathbb{Z}$, hence $\gamma$
is rational, a contradiction.

In fact, it can be shown that if $\gamma$ is irrational, then $\langle n\gamma \rangle$ is dense in the
interval $[0, 1)$, a result originally proved by Kronecker. In other words,
the sequence $\langle n\gamma \rangle$ hits every sub-interval of $[0, 1)$ (and hence it does so
infinitely many times). We will obtain this fact as a corollary of a deeper
theorem dealing with the uniform distribution of the sequence $\langle n\gamma \rangle$.
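
Observations (i) and (ii) are easy to see experimentally. The sketch below is only an illustration under two assumptions: the rational case uses exact `Fraction` arithmetic with the sample value $3/7$, and the irrational case uses the floating-point value of $\sqrt{2}$, which is itself a rational approximation, so it can only suggest the behavior for a limited number of terms.

```python
import math
from fractions import Fraction


def distinct_fractional_parts(gamma, count):
    """Set of fractional parts <n*gamma> for n = 1, ..., count."""
    return {n * gamma - math.floor(n * gamma) for n in range(1, count + 1)}


# (i) rational gamma = 3/7: only the seven values 0, 1/7, ..., 6/7 ever occur
rational_values = distinct_fractional_parts(Fraction(3, 7), 1000)

# (ii) gamma = sqrt(2), in floating point: the first 1000 values are all distinct
irrational_values = distinct_fractional_parts(math.sqrt(2), 1000)

print(len(rational_values), len(irrational_values))
```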

A sequence of numbers $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ in $[0, 1)$ is said to be
equidistributed if for every interval $(a, b) \subset [0, 1)$,

$$\lim_{N \to \infty} \frac{\#\{1 \le n \le N : \xi_n \in (a, b)\}}{N} = b - a$$

where $\#A$ denotes the cardinality of the finite set $A$. This means that
for large $N$, the proportion of numbers $\xi_n$ in $(a, b)$ with $n \le N$ is approximately
equal to the ratio of the length of the interval $(a, b)$ to the length of the interval
$[0, 1)$. In other words, the sequence $\xi_n$ sweeps out the whole interval
evenly, and every sub-interval gets its fair share. Clearly, the ordering of
the sequence is very important, as the next two examples illustrate.

Example 1. The sequence

$$0, \frac{1}{2}, 0, \frac{1}{3}, \frac{2}{3}, 0, \frac{1}{4}, \frac{2}{4}, \frac{3}{4}, 0, \frac{1}{5}, \frac{2}{5}, \ldots$$

appears to be equidistributed since it passes over the interval $[0, 1)$ very
evenly. Of course this is not a proof, and the reader is invited to give
one. For a somewhat related example, see Exercise 8 with $\sigma = 1/2$.

Example 2. Let $\{r_n\}_{n=1}^{\infty}$ be any enumeration of the rationals in $[0, 1)$.
Then the sequence defined by

$$\xi_n = \begin{cases} r_{n/2} & \text{if } n \text{ is even,} \\ 0 & \text{if } n \text{ is odd,} \end{cases}$$

is not equidistributed since “half” of the sequence is at 0. Nevertheless,
this sequence is obviously dense.

We now arrive at the main theorem of this section. 


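Theorem 2.1 If $\gamma$ is irrational, then the sequence of fractional parts
$\langle \gamma \rangle, \langle 2\gamma \rangle, \langle 3\gamma \rangle, \ldots$ is equidistributed in $[0, 1)$.

In particular, $\langle n\gamma \rangle$ is dense in $[0, 1)$, and we get Kronecker’s theorem
as a corollary. In Figure 2 we illustrate the set of points $\langle \gamma \rangle, \langle 2\gamma \rangle,
\langle 3\gamma \rangle, \ldots, \langle N\gamma \rangle$ for three different values of $N$ when $\gamma = \sqrt{2}$.

Figure 2. The sequence $\langle \gamma \rangle, \langle 2\gamma \rangle, \langle 3\gamma \rangle, \ldots, \langle N\gamma \rangle$ when $\gamma = \sqrt{2}$ (three panels on the interval from 0 to 1, for $N = 10$, $N = 30$, and $N = 80$)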
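
The statement of the theorem can also be explored numerically. The following sketch counts how many of the first $N$ fractional parts fall in a given interval; it assumes floating-point arithmetic (so $\sqrt{2}$ is only approximated), and the interval $(0.2, 0.5)$ and the value $N = 20000$ are arbitrary illustrative choices.

```python
import math


def proportion_in_interval(gamma, a, b, N):
    """Fraction of n in 1..N whose fractional part <n*gamma> lies in (a, b)."""
    count = 0
    for n in range(1, N + 1):
        frac = (n * gamma) % 1.0
        if a < frac < b:
            count += 1
    return count / N


# irrational gamma: the proportion should be close to b - a = 0.3
irrational_share = proportion_in_interval(math.sqrt(2), 0.2, 0.5, 20000)

# rational gamma = 1/2: the sequence only takes the values 0.5 and 0,
# so the interval (0.2, 0.4) is never visited
rational_share = proportion_in_interval(0.5, 0.2, 0.4, 20000)

print(irrational_share, rational_share)
```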


Fix $(a, b) \subset [0, 1)$ and let $\chi_{(a,b)}(x)$ denote the characteristic function
of the interval $(a, b)$, that is, the function equal to 1 in $(a, b)$ and 0 in
$[0, 1) - (a, b)$. We may extend this function to $\mathbb{R}$ by periodicity (period 1),
and still denote this extension by $\chi_{(a,b)}(x)$. Then, as a consequence
of the definitions, we find that

$$\#\{1 \le n \le N : \langle n\gamma \rangle \in (a, b)\} = \sum_{n=1}^{N} \chi_{(a,b)}(n\gamma),$$

and the theorem can be reformulated as the statement that

$$\frac{1}{N} \sum_{n=1}^{N} \chi_{(a,b)}(n\gamma) \to \int_0^1 \chi_{(a,b)}(x)\, dx, \quad \text{as } N \to \infty.$$

This step removes the difficulty of working with fractional parts and
reduces the number theory to analysis.

The heart of the matter lies in the following result. 

Lemma 2.2 If $f$ is continuous and periodic of period 1, and $\gamma$ is irrational,
then

$$\frac{1}{N} \sum_{n=1}^{N} f(n\gamma) \to \int_0^1 f(x)\, dx \quad \text{as } N \to \infty.$$

The proof of the lemma is divided into three steps. 

Step 1. We first check the validity of the limit in the case when $f$
is one of the exponentials $1, e^{2\pi i x}, \ldots, e^{2\pi i k x}, \ldots$. If $f = 1$, the limit
surely holds. If $f = e^{2\pi i k x}$ with $k \neq 0$, then the integral is 0. Since $\gamma$ is
irrational, we have $e^{2\pi i k \gamma} \neq 1$, therefore

$$\frac{1}{N} \sum_{n=1}^{N} f(n\gamma) = \frac{e^{2\pi i k \gamma}}{N} \, \frac{1 - e^{2\pi i k N \gamma}}{1 - e^{2\pi i k \gamma}},$$

which goes to 0 as $N \to \infty$.
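
The geometric-sum identity of Step 1 can be checked numerically. The sketch below compares the directly computed average with the closed form, and with the bound $2 / (N |1 - e^{2\pi i k \gamma}|)$ that follows from it; the values $\gamma = \sqrt{2}$, $k = 3$, $N = 5000$ are arbitrary illustrative choices, and floating-point arithmetic is assumed.

```python
import cmath
import math


def exponential_average(gamma, k, N):
    """(1/N) * sum_{n=1}^{N} exp(2*pi*i*k*n*gamma), computed term by term."""
    total = sum(cmath.exp(2j * math.pi * k * n * gamma) for n in range(1, N + 1))
    return total / N


def closed_form_average(gamma, k, N):
    """The same average via the geometric-sum formula; needs exp(2*pi*i*k*gamma) != 1."""
    z = cmath.exp(2j * math.pi * k * gamma)
    return (z / N) * (1 - z ** N) / (1 - z)


gamma, k, N = math.sqrt(2), 3, 5000
direct = exponential_average(gamma, k, N)
formula = closed_form_average(gamma, k, N)
# bound obtained from |1 - z^N| <= 2
bound = 2.0 / (N * abs(1 - cmath.exp(2j * math.pi * k * gamma)))
print(abs(direct), abs(formula), bound)
```

The bound makes the rate of decay visible: for fixed $k$ the average is $O(1/N)$, with a constant that degrades when $k\gamma$ is close to an integer.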

Step 2. It is clear that if $f$ and $g$ satisfy the lemma, then so does
$Af + Bg$ for any $A, B \in \mathbb{C}$. Therefore, the first step implies that the
lemma is true for all trigonometric polynomials.

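Step 3. Let $\epsilon > 0$. If $f$ is any continuous periodic function of period 1,
choose a trigonometric polynomial $P$ so that $\sup_{x \in \mathbb{R}} |f(x) - P(x)| < \epsilon/3$
(this is possible by Corollary 5.4 in Chapter 2). Then, by step 2, for all
large $N$ we have

$$\left| \frac{1}{N} \sum_{n=1}^{N} P(n\gamma) - \int_0^1 P(x)\, dx \right| < \epsilon/3.$$

Therefore

$$\left| \frac{1}{N} \sum_{n=1}^{N} f(n\gamma) - \int_0^1 f(x)\, dx \right| \le \frac{1}{N} \sum_{n=1}^{N} |f(n\gamma) - P(n\gamma)| + \left| \frac{1}{N} \sum_{n=1}^{N} P(n\gamma) - \int_0^1 P(x)\, dx \right| + \int_0^1 |P(x) - f(x)|\, dx < \epsilon,$$

and the lemma is proved.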


Now we can finish the proof of the theorem. Choose two continuous
periodic functions $f_\epsilon^+$ and $f_\epsilon^-$ of period 1 which approximate $\chi_{(a,b)}(x)$
on $[0, 1)$ from above and below; both $f_\epsilon^+$ and $f_\epsilon^-$ are bounded by 1 and
agree with $\chi_{(a,b)}(x)$ except in intervals of total length $2\epsilon$ (see Figure 3).
In particular, $f_\epsilon^-(x) \le \chi_{(a,b)}(x) \le f_\epsilon^+(x)$, and

$$b - a - 2\epsilon \le \int_0^1 f_\epsilon^-(x)\, dx \quad \text{and} \quad \int_0^1 f_\epsilon^+(x)\, dx \le b - a + 2\epsilon.$$

If $S_N = \frac{1}{N} \sum_{n=1}^{N} \chi_{(a,b)}(n\gamma)$, then we get

$$\frac{1}{N} \sum_{n=1}^{N} f_\epsilon^-(n\gamma) \le S_N \le \frac{1}{N} \sum_{n=1}^{N} f_\epsilon^+(n\gamma).$$

Figure 3. Approximations of $\chi_{(a,b)}(x)$

Therefore

$$b - a - 2\epsilon \le \liminf_{N \to \infty} S_N \quad \text{and} \quad \limsup_{N \to \infty} S_N \le b - a + 2\epsilon.$$

Since this is true for every $\epsilon > 0$, the limit $\lim_{N \to \infty} S_N$ exists and must
equal $b - a$. This completes the proof of the equidistribution theorem.

This theorem has the following consequence. 

Corollary 2.3 The conclusion of Lemma 2.2 holds for every function
$f$ which is Riemann integrable in $[0, 1]$, and periodic of period 1.

Proof. Assume $f$ is real-valued, and consider a partition of the
interval $[0, 1]$, say $0 = x_0 < x_1 < \cdots < x_N = 1$. Next, define
$f_U(x) = \sup_{x_{j-1} \le y \le x_j} f(y)$ if $x \in [x_{j-1}, x_j)$ and
$f_L(x) = \inf_{x_{j-1} \le y \le x_j} f(y)$ for $x \in (x_{j-1}, x_j)$.
Then clearly $f_L \le f \le f_U$ and

$$\int_0^1 f_L(x)\, dx \le \int_0^1 f(x)\, dx \le \int_0^1 f_U(x)\, dx.$$

Moreover, by making the partition sufficiently fine we can guarantee that
for a given $\epsilon > 0$,

$$\int_0^1 f_U(x)\, dx - \int_0^1 f_L(x)\, dx \le \epsilon.$$

However,

$$\frac{1}{N} \sum_{n=1}^{N} f_L(n\gamma) \to \int_0^1 f_L(x)\, dx$$

by the theorem, because each $f_L$ is a finite linear combination of characteristic
functions of intervals; similarly we have

$$\frac{1}{N} \sum_{n=1}^{N} f_U(n\gamma) \to \int_0^1 f_U(x)\, dx.$$

From these two assertions we can conclude the proof of the corollary by
using the previous approximation argument.

There is an interesting interpretation of the lemma and its corollary,
in terms of a simple dynamical system. In this example, the underlying
space is the circle parametrized by the angle $\theta$. We also consider a
mapping of this space to itself: here, we choose a rotation $\rho$ of the circle
by the angle $2\pi\gamma$, that is, the transformation $\rho : \theta \mapsto \theta + 2\pi\gamma$.

We want next to consider how this space, with its underlying action
$\rho$, evolves in time. In other words, we wish to consider the iterates of $\rho$,
namely $\rho, \rho^2, \rho^3, \ldots, \rho^n$ where

$$\rho^n = \rho \circ \rho \circ \cdots \circ \rho : \theta \mapsto \theta + 2\pi n \gamma,$$

and where we think of the action $\rho^n$ taking place at the time $t = n$.

To each Riemann integrable function $f$ on the circle, we can also associate
the corresponding effects of the rotation $\rho$, and obtain a sequence
of functions

$$f(\theta), f(\rho(\theta)), f(\rho^2(\theta)), \ldots, f(\rho^n(\theta)), \ldots$$

with $f(\rho^n(\theta)) = f(\theta + 2\pi n \gamma)$. In this special context, the ergodicity of
this system is then the statement that the “time average”

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(\rho^n(\theta))$$

exists for each $\theta$ and equals the “space average”

$$\frac{1}{2\pi} \int_0^{2\pi} f(\theta)\, d\theta,$$

whenever $\gamma$ is irrational. In fact, this assertion is merely a rephrasing of
Corollary 2.3, once we make the change of variables $\theta = 2\pi x$.
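
The equality of time and space averages can be observed on a concrete function. The sketch below is an illustration under stated assumptions: it uses $f(\theta) = |\sin \theta|$, whose space average is $2/\pi$, the rotation number $\gamma = \sqrt{2}$ in floating point, and an arbitrary starting angle; none of these choices come from the text.

```python
import math


def time_average(f, theta, gamma, N):
    """(1/N) * sum_{n=1}^{N} f(theta + 2*pi*n*gamma): the average of f along the orbit."""
    return sum(f(theta + 2 * math.pi * n * gamma) for n in range(1, N + 1)) / N


def f(theta):
    return abs(math.sin(theta))


# (1/(2*pi)) * integral of |sin| over [0, 2*pi] equals 4/(2*pi) = 2/pi
space_average = 2 / math.pi

orbit_average = time_average(f, 0.3, math.sqrt(2), 20000)
print(orbit_average, space_average)
```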

Returning to the problem of equidistributed sequences, we observe that
the proof of Theorem 2.1 gives the following characterization.

Weyl’s criterion. A sequence of real numbers $\xi_1, \xi_2, \ldots$ in
$[0, 1)$ is equidistributed if and only if for all integers $k \neq 0$ one
has

$$\frac{1}{N} \sum_{n=1}^{N} e^{2\pi i k \xi_n} \to 0, \quad \text{as } N \to \infty.$$

One direction of this theorem was in effect proved above, and the converse
can be found in Exercise 7. In particular, we find that to understand
the equidistributive properties of a sequence $\xi_n$, it suffices to estimate
the size of the corresponding “exponential sum” $\sum_{n=1}^{N} e^{2\pi i k \xi_n}$. For example,
it can be shown using Weyl’s criterion that the sequence $\langle n^2 \gamma \rangle$
is equidistributed whenever $\gamma$ is irrational. This and other examples can
be found in Exercises 8 and 9; also Problems 2 and 3.
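
Weyl's criterion suggests a simple numerical diagnostic: compute the normalized exponential sum and see whether it is small. The sketch below does this for $\langle n^2 \gamma \rangle$ with $\gamma = \sqrt{2}$ and, for contrast, for a sequence in the spirit of Example 2 in which every odd-indexed term is 0. The second sequence is a simplified stand-in (it uses $\langle (n/2)\gamma \rangle$ at even indices instead of an enumeration of the rationals), and a finite computation can of course only suggest, not prove, equidistribution.

```python
import cmath
import math


def weyl_average(sequence, k):
    """(1/N) * sum over the sequence of exp(2*pi*i*k*xi)."""
    return sum(cmath.exp(2j * math.pi * k * xi) for xi in sequence) / len(sequence)


N = 20000
gamma = math.sqrt(2)

# fractional parts of n^2 * gamma
squares = [(n * n * gamma) % 1.0 for n in range(1, N + 1)]

# half of the terms sit at 0, so the sequence cannot be equidistributed
half_at_zero = [0.0 if n % 2 == 1 else ((n // 2) * gamma) % 1.0 for n in range(1, N + 1)]

squares_size = abs(weyl_average(squares, 1))
half_at_zero_size = abs(weyl_average(half_at_zero, 1))
print(squares_size, half_at_zero_size)
```

The second average stays near $1/2$, the contribution of the terms at 0, which is exactly the obstruction to equidistribution noted in Example 2.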

As a last remark, we mention a nice geometric interpretation of the
distribution properties of $\langle n\gamma \rangle$. Suppose that the sides of a square are
reflecting mirrors and that a ray of light leaves a point inside the square.
What kind of path will the light trace out?

Figure 4. Reflection of a ray of light in a square

To solve this problem, the main idea is to consider the grid of the
plane formed by successively reflecting the initial square across its sides.
With an appropriate choice of axis, the path traced by the light in the
square corresponds to the straight line $P + (t, \gamma t)$ in the plane. As a
result, the reader may observe that the path will be either closed and
periodic, or it will be dense in the square. The first of these situations
will happen if and only if the slope $\gamma$ of the initial direction of the light
(determined with respect to one of the sides of the square) is rational.
In the second situation, when $\gamma$ is irrational, the density follows from
Kronecker’s theorem. What stronger conclusion does one get from the
equidistribution theorem?


3 A continuous but nowhere differentiable function 

There are many obvious examples of continuous functions that are not
differentiable at one point, say $f(x) = |x|$. It is almost as easy to construct
a continuous function that is not differentiable at any given finite
set of points, or even at appropriate sets containing countably many
points. A more subtle problem is whether there exists a continuous
function that is nowhere differentiable. In 1861, Riemann guessed that
the function defined by

$$R(x) = \sum_{n=1}^{\infty} \frac{\sin(n^2 x)}{n^2} \tag{5}$$

was nowhere differentiable. He was led to consider this function because
of its close connection to the theta function which will be introduced in
Chapter 5. Riemann never gave a proof, but mentioned this example in
one of his lectures. This triggered the interest of Weierstrass who, in an
attempt to find a proof, came across the first example of a continuous but
nowhere differentiable function. Say $0 < b < 1$ and $a$ is an integer $> 1$.
In 1872 he proved that if $ab > 1 + 3\pi/2$, then the function

$$W(x) = \sum_{n=1}^{\infty} b^n \cos(a^n x)$$

is nowhere differentiable.

But the story is not complete without a final word about Riemann’s
original function. In 1916 Hardy showed that $R$ is not differentiable at
all irrational multiples of $\pi$, and also at certain rational multiples of $\pi$.
However, it was not until much later, in 1969, that Gerver completely
settled the problem, first by proving that the function $R$ is actually
differentiable at all the rational multiples of $\pi$ of the form $\pi p/q$ with $p$
and $q$ odd integers, and then by showing that $R$ is not differentiable in
all of the remaining cases.
In this section, we prove the following theorem.

Theorem 3.1 If $0 < \alpha < 1$, then the function

$$f_\alpha(x) = f(x) = \sum_{n=0}^{\infty} 2^{-n\alpha} e^{i 2^n x}$$

is continuous but nowhere differentiable.

The continuity is clear because of the absolute convergence of the series.
The crucial property of $f$ which we need is that it has many vanishing
Fourier coefficients. A Fourier series that skips many terms, like
the one given above, or like $W(x)$, is called a lacunary Fourier series.
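
The absolute convergence behind the continuity claim can be made quantitative: the tail of the series beyond the first $M$ terms is bounded by the geometric sum $\sum_{n \ge M} 2^{-n\alpha} = 2^{-M\alpha} / (1 - 2^{-\alpha})$, uniformly in $x$. The sketch below evaluates partial sums of $f_\alpha$ and this tail bound; the parameter values $\alpha = 1/2$, $x = 1$ and the truncation levels are arbitrary illustrative choices.

```python
import cmath


def f_alpha_partial(x, alpha, terms):
    """Sum of 2^(-n*alpha) * exp(i * 2^n * x) for n = 0, ..., terms - 1."""
    return sum(2.0 ** (-n * alpha) * cmath.exp(1j * (2 ** n) * x) for n in range(terms))


def tail_bound(alpha, terms):
    """Uniform bound on the omitted part: sum over n >= terms of 2^(-n*alpha)."""
    return 2.0 ** (-terms * alpha) / (1.0 - 2.0 ** (-alpha))


alpha, x = 0.5, 1.0
coarse = f_alpha_partial(x, alpha, 20)
fine = f_alpha_partial(x, alpha, 40)
print(abs(fine - coarse), tail_bound(alpha, 20))
```

The only frequencies that appear are the powers of 2, which is the lacunary structure the proof exploits.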

The proof of the theorem is really the story of three methods of summing
a Fourier series. First, there is the ordinary convergence in terms
of the partial sums $S_N(g) = g * D_N$. Next, there is Cesàro summability
$\sigma_N(g) = g * F_N$, with $F_N$ the Fejér kernel. A third method, clearly
connected with the second, involves the delayed means defined by

$$\Delta_N(g) = 2\sigma_{2N}(g) - \sigma_N(g).$$

Hence $\Delta_N(g) = g * [2F_{2N} - F_N]$. These methods can best be visualized
as in Figure 5.

Suppose $g(x) \sim \sum a_n e^{inx}$. Then:

• $S_N$ arises by multiplying the term $a_n e^{inx}$ by 1 if $|n| \le N$, and 0 if
$|n| > N$.

• $\sigma_N$ arises by multiplying $a_n e^{inx}$ by $1 - |n|/N$ for $|n| \le N$ and 0 for
$|n| > N$.

• $\Delta_N$ arises by multiplying $a_n e^{inx}$ by 1 if $|n| \le N$, by $2(1 - |n|/(2N))$
for $N \le |n| \le 2N$, and 0 for $|n| > 2N$.
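
The three multiplier rules listed above can be written out and checked exactly. The sketch below, using exact rational arithmetic, verifies that the multiplier of $2\sigma_{2N} - \sigma_N$ agrees with the piecewise description of $\Delta_N$, and that on the lacunary frequencies $2^j$ the delayed mean $\Delta_{N'}$ with $N' = 2^k$ acts exactly like a partial sum. The function names and the test sizes are illustrative choices.

```python
from fractions import Fraction


def partial_sum_multiplier(n, N):
    """Factor applied to a_n e^{inx} by S_N."""
    return Fraction(1) if abs(n) <= N else Fraction(0)


def cesaro_multiplier(n, N):
    """Factor applied by sigma_N: 1 - |n|/N for |n| <= N, and 0 otherwise."""
    return Fraction(N - abs(n), N) if abs(n) <= N else Fraction(0)


def delayed_multiplier(n, N):
    """Factor applied by Delta_N = 2*sigma_{2N} - sigma_N."""
    return 2 * cesaro_multiplier(n, 2 * N) - cesaro_multiplier(n, N)


def delayed_multiplier_piecewise(n, N):
    """The piecewise description of the Delta_N multiplier given in the text."""
    m = abs(n)
    if m <= N:
        return Fraction(1)
    if m <= 2 * N:
        return 2 * (1 - Fraction(m, 2 * N))
    return Fraction(0)


N = 8
formulas_agree = all(delayed_multiplier(n, N) == delayed_multiplier_piecewise(n, N)
                     for n in range(-3 * N, 3 * N + 1))

# on the frequencies 2^j, Delta_{N'} with N' = 2^3 keeps exactly the terms with 2^j <= N'
lacunary_agree = all(delayed_multiplier(2 ** j, 8) == partial_sum_multiplier(2 ** j, 8)
                     for j in range(0, 8))
print(formulas_agree, lacunary_agree)
```

The second check works because the next lacunary frequency after $N'$ is $2N'$, exactly where the delayed-mean multiplier has already dropped to 0.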

For example, note that

$$\sigma_N(g)(x) = \frac{S_0(g)(x) + \cdots + S_{N-1}(g)(x)}{N} = \frac{1}{N} \sum_{\ell=0}^{N-1} \sum_{|k| \le \ell} a_k e^{ikx} = \frac{1}{N} \sum_{|n| \le N} (N - |n|)\, a_n e^{inx} = \sum_{|n| \le N} \left( 1 - \frac{|n|}{N} \right) a_n e^{inx}.$$

The proof of the other assertion is similar.

Figure 5. Three summation methods (three panels showing the multipliers: partial sums $S_N(g)(x) = \sum_{|n| \le N} a_n e^{inx}$; Cesàro means $\sigma_N(g)(x) = \sum_{|n| \le N} \left( 1 - \frac{|n|}{N} \right) a_n e^{inx}$; delayed means $\Delta_N(g)(x) = 2\sigma_{2N}(g)(x) - \sigma_N(g)(x)$)

The delayed means have two important features. On the one hand,
their properties are closely related to the (good) features of the Cesàro
means. On the other hand, for series that have lacunary properties like
those of $f$, the delayed means are essentially equal to the partial sums.
In particular, note that for our function $f = f_\alpha$

$$S_N(f) = \Delta_{N'}(f), \tag{6}$$

where $N'$ is the largest integer of the form $2^k$ with $N' \le N$. This is clear
by examining Figure 5 and the definition of $f$.

We turn to the proof of the theorem proper and argue by contradiction;
that is, we assume that $f'(x_0)$ exists for some $x_0$.

Lemma 3.2 Let $g$ be any continuous function that is differentiable at
$x_0$. Then, the Cesàro means satisfy $\sigma_N(g)'(x_0) = O(\log N)$, therefore

$$\Delta_N(g)'(x_0) = O(\log N).$$

Proof. First we have

$$\sigma_N(g)'(x_0) = \int_{-\pi}^{\pi} F_N'(x_0 - t)\, g(t)\, dt = \int_{-\pi}^{\pi} F_N'(t)\, g(x_0 - t)\, dt,$$

where $F_N$ is the Fejér kernel. Since $F_N$ is periodic, we have $\int_{-\pi}^{\pi} F_N'(t)\, dt = 0$
and this implies that

$$\sigma_N(g)'(x_0) = \int_{-\pi}^{\pi} F_N'(t)\, [g(x_0 - t) - g(x_0)]\, dt.$$

From the assumption that $g$ is differentiable at $x_0$ we get

$$|\sigma_N(g)'(x_0)| \le C \int_{-\pi}^{\pi} |F_N'(t)|\, |t|\, dt.$$

Now observe that $F_N'$ satisfies the two estimates

$$|F_N'(t)| \le A N^2 \quad \text{and} \quad |F_N'(t)| \le \frac{A}{|t|^2}.$$

For the first inequality, recall that $F_N$ is a trigonometric polynomial
of degree $N$ whose coefficients are bounded by 1. Therefore, $F_N'$ is a
trigonometric polynomial of degree $N$ whose coefficients are no bigger
than $N$. Hence $|F_N'(t)| \le (2N + 1) N \le A N^2$.

For the second inequality, we recall that

$$F_N(t) = \frac{1}{N} \frac{\sin^2(Nt/2)}{\sin^2(t/2)}.$$

Differentiating this expression, we get two terms:

$$\frac{\sin(Nt/2) \cos(Nt/2)}{\sin^2(t/2)} - \frac{1}{N} \frac{\cos(t/2) \sin^2(Nt/2)}{\sin^3(t/2)}.$$

If we then use the facts that $|\sin(Nt/2)| \le C N |t|$ and $|\sin(t/2)| \ge c |t|$
(for $|t| \le \pi$), we get the desired estimates for $F_N'(t)$.

Using all of these estimates we find that

$$|\sigma_N(g)'(x_0)| \le C \int_{|t| \ge 1/N} |F_N'(t)|\, |t|\, dt + C \int_{|t| \le 1/N} |F_N'(t)|\, |t|\, dt$$
$$\le C A \int_{|t| \ge 1/N} \frac{dt}{|t|} + C A N \int_{|t| \le 1/N} dt$$
$$= O(\log N) + O(1)$$
$$= O(\log N).$$

The proof of the lemma is complete once we invoke the definition of $\Delta_N(g)$.

Lemma 3.3 If $2N = 2^n$, then

$$\Delta_{2N}(f) - \Delta_N(f) = 2^{-n\alpha} e^{i 2^n x}.$$

This follows from our previous observation (6) because $\Delta_{2N}(f) = S_{2N}(f)$
and $\Delta_N(f) = S_N(f)$.

Now, by the first lemma we have

$$\Delta_{2N}(f)'(x_0) - \Delta_N(f)'(x_0) = O(\log N),$$

and the second lemma also implies

$$|\Delta_{2N}(f)'(x_0) - \Delta_N(f)'(x_0)| = 2^{n(1-\alpha)} \ge c N^{1-\alpha}.$$

This is the desired contradiction since $N^{1-\alpha}$ grows faster than $\log N$.

A few additional remarks about our function $f_\alpha(x) = \sum_{n=0}^{\infty} 2^{-n\alpha} e^{i 2^n x}$
are in order.

This function is complex-valued as opposed to the examples $R$ and $W$
above, and so the nowhere differentiability of $f_\alpha$ does not imply the same
property for its real and imaginary parts. However, a small modification
of our proof shows that, in fact, the real part of $f_\alpha$,

$$\sum_{n=0}^{\infty} 2^{-n\alpha} \cos 2^n x,$$

as well as its imaginary part, are both nowhere differentiable. To see
this, observe first that by the same proof, Lemma 3.2 has the following
generalization: if $g$ is a continuous function which is differentiable at $x_0$,
then

$$\Delta_N(g)'(x_0 + h) = O(\log N) \quad \text{whenever } |h| \le c/N.$$

We then proceed with $F(x) = \sum_{n=0}^{\infty} 2^{-n\alpha} \cos 2^n x$, noting as above that
$\Delta_{2N}(F) - \Delta_N(F) = 2^{-n\alpha} \cos 2^n x$; as a result, assuming that $F$ is
differentiable at $x_0$, we get that

$$|2^{n(1-\alpha)} \sin(2^n (x_0 + h))| = O(\log N)$$

when $2N = 2^n$, and $|h| \le c/N$. To get a contradiction, we need only
choose $h$ so that $|\sin(2^n (x_0 + h))| = 1$; this is accomplished by setting
$\delta$ equal to the distance from $2^n x_0$ to the nearest number of the form
$(k + 1/2)\pi$, $k \in \mathbb{Z}$ (so $\delta \le \pi/2$), and taking $h = \pm \delta / 2^n$.

Clearly, when $\alpha > 1$ the function $f_\alpha$ is continuously differentiable since
the series can be differentiated term by term. Finally, the nowhere
differentiability we have proved for $\alpha < 1$ actually extends to $\alpha = 1$ by a
suitable refinement of the argument (see Problem 8 in Chapter 5). In
fact, using these more elaborate methods one can also show that the
Weierstrass function $W$ is nowhere differentiable if $ab \ge 1$.


4 The heat equation on the circle 

As a final illustration, we return to the original problem of heat diffusion
considered by Fourier.

Suppose we are given an initial temperature distribution at $t = 0$ on a
ring and that we are asked to describe the temperature at points on the
ring at times $t > 0$.

The ring is modeled by the unit circle. A point on this circle is described
by its angle $\theta = 2\pi x$, where the variable $x$ lies between 0 and 1.
If $u(x, t)$ denotes the temperature at time $t$ of a point described by the
angle $\theta$, then considerations similar to the ones given in Chapter 1 show
that $u$ satisfies the differential equation

$$\frac{\partial u}{\partial t} = c\, \frac{\partial^2 u}{\partial x^2}. \tag{7}$$

The constant $c$ is a positive physical constant which depends on the
material of which the ring is made (see Section 2.1 in Chapter 1). After
rescaling the time variable, we may assume that $c = 1$. If $f$ is our initial
data, we impose the condition

$$u(x, 0) = f(x).$$

To solve the problem, we separate variables and look for special solutions
of the form

$$u(x, t) = A(x) B(t).$$

Then inserting this expression for $u$ into the heat equation we get

$$\frac{B'(t)}{B(t)} = \frac{A''(x)}{A(x)}.$$

Both sides are therefore constant, say equal to $\lambda$. Since $A$ must be
periodic of period 1, we see that the only possibility is $\lambda = -4\pi^2 n^2$,
where $n \in \mathbb{Z}$. Then $A$ is a linear combination of the exponentials $e^{2\pi i n x}$
and $e^{-2\pi i n x}$, and $B(t)$ is a multiple of $e^{-4\pi^2 n^2 t}$. By superposing these
solutions, we are led to

$$u(x, t) = \sum_{n=-\infty}^{\infty} a_n e^{-4\pi^2 n^2 t} e^{2\pi i n x}, \tag{8}$$

where, setting $t = 0$, we see that $\{a_n\}$ are the Fourier coefficients of $f$.

Note that when $f$ is Riemann integrable, the coefficients $a_n$ are
bounded, and since the factor $e^{-4\pi^2 n^2 t}$ tends to zero extremely fast, the
series defining $u$ converges. In fact, in this case, $u$ is twice differentiable
and solves equation (7).

The natural question with regard to the boundary condition is the
following: do we have $u(x, t) \to f(x)$ as $t$ tends to 0, and in what sense?
A simple application of the Parseval identity shows that this limit holds
in the mean square sense (Exercise 11). For a better understanding of
the properties of our solution (8), we write it as

$$u(x, t) = (f * H_t)(x)$$

where $H_t$ is the heat kernel for the circle, given by

$$H_t(x) = \sum_{n=-\infty}^{\infty} e^{-4\pi^2 n^2 t} e^{2\pi i n x}, \tag{9}$$

and where the convolution for functions with period 1 is defined by

$$(f * g)(x) = \int_0^1 f(x - y)\, g(y)\, dy.$$

An analogy between the heat kernel and the Poisson kernel (of Chapter 2)
is given in Exercise 12. However, unlike in the case of the Poisson kernel,
there is no elementary formula for the heat kernel. Nevertheless, it turns
out that it is a good kernel (in the sense of Chapter 2). The proof is
not obvious and requires the use of the celebrated Poisson summation
formula, which will be taken up in Chapter 5. As a corollary, we will
also find that $H_t$ is everywhere positive, a fact that is also not obvious
from its defining expression (9). We can, however, give the following
heuristic argument for the positivity of $H_t$. Suppose that we begin with
an initial temperature distribution $f$ which is everywhere $\le 0$. Then it
is physically reasonable to expect $u(x, t) \le 0$ for all $t$ since heat travels
from hot to cold. Now

$$u(x, t) = \int_0^1 f(x - y)\, H_t(y)\, dy.$$

If $H_t$ is negative for some $x_0$, then we may choose $f \le 0$ supported near
$x_0$, and this would imply $u(x_0, t) > 0$, which is a contradiction.
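
Although there is no elementary closed formula for $H_t$, the series (9) converges so fast that it is easy to evaluate numerically. The sketch below sums the real form $H_t(x) = 1 + 2 \sum_{n \ge 1} e^{-4\pi^2 n^2 t} \cos(2\pi n x)$, truncated after finitely many terms, and samples it on a uniform grid; the time $t = 0.01$, the truncation level, and the grid size are arbitrary illustrative choices. This is a numerical observation consistent with the positivity and unit mass of $H_t$, not a proof.

```python
import math


def heat_kernel(x, t, terms=50):
    """Truncated series (9): 1 + 2 * sum_{n=1}^{terms} exp(-4 pi^2 n^2 t) cos(2 pi n x)."""
    total = 1.0
    for n in range(1, terms + 1):
        decay = math.exp(-4.0 * math.pi ** 2 * n * n * t)
        total += 2.0 * decay * math.cos(2.0 * math.pi * n * x)
    return total


t = 0.01
grid = [j / 200 for j in range(200)]
values = [heat_kernel(x, t) for x in grid]

smallest = min(values)             # attained near x = 1/2, far from the peak at x = 0
mean_value = sum(values) / len(values)  # approximates the integral of H_t over [0, 1]
print(smallest, mean_value)
```

The kernel is sharply peaked at $x = 0$ for small $t$ and very small, but still positive, near $x = 1/2$.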


5 Exercises 


1. Let $\gamma : [a, b] \to \mathbb{R}^2$ be a parametrization for the closed curve $\Gamma$.

(a) Prove that $\gamma$ is a parametrization by arc-length if and only if the length
of the curve from $\gamma(a)$ to $\gamma(s)$ is precisely $s - a$, that is,

$$\int_a^s |\gamma'(t)|\, dt = s - a.$$

(b) Prove that any curve $\Gamma$ admits a parametrization by arc-length. [Hint: If
$\eta$ is any parametrization, let $h(s) = \int_a^s |\eta'(t)|\, dt$ and consider $\gamma = \eta \circ h^{-1}$.]


2. Suppose $\gamma : [a, b] \to \mathbb{R}^2$ is a parametrization for a closed curve $\Gamma$, with
$\gamma(t) = (x(t), y(t))$.

(a) Prove that

$$\frac{1}{2} \int_a^b (x(s) y'(s) - y(s) x'(s))\, ds = \int_a^b x(s) y'(s)\, ds = -\int_a^b y(s) x'(s)\, ds.$$

(b) Define the reverse parametrization of $\gamma$ by $\gamma^- : [a, b] \to \mathbb{R}^2$ with
$\gamma^-(t) = \gamma(b + a - t)$. The image of $\gamma^-$ is precisely $\Gamma$, except that the
points $\gamma^-(t)$ and $\gamma(t)$ travel in opposite directions. Thus $\gamma^-$ “reverses”
the orientation of the curve. Prove that

$$\int_{\gamma} (x\, dy - y\, dx) = -\int_{\gamma^-} (x\, dy - y\, dx).$$

In particular, we may assume (after a possible change in orientation) that

$$\mathcal{A} = \frac{1}{2} \int_a^b (x(s) y'(s) - y(s) x'(s))\, ds = \int_a^b x(s) y'(s)\, ds.$$


3. Suppose $\Gamma$ is a curve in the plane, and that there exists a set of coordinates
$x$ and $y$ so that the $x$-axis divides the curve into the union of the graph of
two continuous functions $y = f(x)$ and $y = g(x)$ for $0 \le x \le 1$, and with $f(x) \ge g(x)$
(see Figure 6). Let $\Omega$ denote the region between the graphs of these two
functions:

$$\Omega = \{(x, y) : 0 \le x \le 1 \text{ and } g(x) \le y \le f(x)\}.$$

With the familiar interpretation that the integral $\int h(x)\, dx$ gives the area
under the graph of the function $h$, we see that the area of $\Omega$ is
$\int_0^1 f(x)\, dx - \int_0^1 g(x)\, dx$. Show that this definition coincides with the area
formula $\mathcal{A}$ given in the text, that is,

$$\int_0^1 f(x)\, dx - \int_0^1 g(x)\, dx = \left| -\int_\Gamma y\, dx \right| = \mathcal{A}.$$

Also, note that if the orientation of the curve is chosen so that $\Omega$ “lies to the
left” of $\Gamma$, then the above formula holds without the absolute value signs.

This formula generalizes to any set that can be written as a finite union of
domains like $\Omega$ above.


4. Observe that with the definition of $\ell$ and $\mathcal{A}$ given in the text, the isoperimetric
inequality continues to hold (with the same proof) even when $\Gamma$ is not simple.

Show that this stronger version of the isoperimetric inequality is equivalent
to Wirtinger’s inequality, which says that if $f$ is $2\pi$-periodic, of class $C^1$, and
satisfies $\int_0^{2\pi} f(t)\, dt = 0$, then

$$\int_0^{2\pi} |f(t)|^2\, dt \le \int_0^{2\pi} |f'(t)|^2\, dt$$

with equality if and only if $f(t) = A \sin t + B \cos t$ (Exercise 11, Chapter 3).

[Hint: In one direction, note that if the length of the curve is $2\pi$ and $\gamma$ is an
appropriate arc-length parametrization, then

$$2(\pi - \mathcal{A}) = \int_0^{2\pi} [x'(s) + y(s)]^2\, ds + \int_0^{2\pi} (y'(s)^2 - y(s)^2)\, ds.$$

A change of coordinates will guarantee $\int_0^{2\pi} y(s)\, ds = 0$. For the other direction,
start with a real-valued $f$ satisfying all the hypotheses of Wirtinger’s inequality,
and construct $g$, $2\pi$-periodic and so that the term in brackets above vanishes.]


5. Prove that the sequence $\{\gamma_n\}_{n=1}^{\infty}$, where $\gamma_n$ is the fractional part of

$$\left(\frac{1 + \sqrt{5}}{2}\right)^n,$$

is not equidistributed in $[0, 1]$.

[Hint: Show that $U_n = \left(\frac{1+\sqrt{5}}{2}\right)^n + \left(\frac{1-\sqrt{5}}{2}\right)^n$ is the solution of the difference
equation $U_{r+1} = U_r + U_{r-1}$ with $U_0 = 2$ and $U_1 = 1$. The $U_n$ satisfy the same
difference equation as the Fibonacci numbers.]
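As a numerical illustration of the hint (a sketch only, not a proof), the following Python snippet checks that the powers of the golden ratio stay very close to the integers $U_n$, so that their fractional parts cluster near 0 and 1. The names `lucas` and `distance_to_nearest_integer` are illustrative and do not appear in the text; floating-point arithmetic limits the check to moderate values of $n$.

```python
import math

SQRT5 = math.sqrt(5.0)
PHI = (1.0 + SQRT5) / 2.0   # golden ratio
PSI = (1.0 - SQRT5) / 2.0   # conjugate root, |PSI| < 1


def lucas(n):
    """Integers U_n with U_0 = 2, U_1 = 1 and U_{r+1} = U_r + U_{r-1}."""
    a, b = 2, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def distance_to_nearest_integer(x):
    frac = x - math.floor(x)
    return min(frac, 1.0 - frac)


# PHI**n = U_n - PSI**n, so PHI**n lies within |PSI|**n of the integer U_n.
distances = [distance_to_nearest_integer(PHI ** n) for n in range(10, 31)]
print(max(distances))
```

Since the distances shrink geometrically, no fixed interval such as $[1/4, 3/4]$ receives its fair share of the $\gamma_n$.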


6. Let $\theta = p/q$ be a rational number where $p$ and $q$ are relatively prime integers
(that is, $\theta$ is in lowest form). We assume without loss of generality that
$q > 0$. Define a sequence of numbers in $[0, 1)$ by $\xi_n = \langle n\theta \rangle$ where $\langle \cdot \rangle$ denotes the
fractional part. Show that the sequence $\{\xi_1, \xi_2, \ldots\}$ is equidistributed on the
points of the form

$$0,\ 1/q,\ 2/q,\ \ldots,\ (q-1)/q.$$

In fact, prove that for any $0 \le a < q$, one has

$$\frac{\#\{n : 1 \le n \le N,\ \langle n\theta \rangle = a/q\}}{N} = \frac{1}{q} + O\!\left(\frac{1}{N}\right).$$

[Hint: For each integer $k \ge 0$, there exists a unique integer $n$ with $kq \le n < (k+1)q$
and so that $\langle n\theta \rangle = a/q$. Why can one assume $k = 0$? Prove the existence
of $n$ by using the fact¹ that if $p$ and $q$ are relatively prime, there exist integers
$x, y$ such that $xp + yq = 1$. Next, divide $N$ by $q$ with remainder, that is, write
$N = \ell q + r$ where $0 \le \ell$ and $0 \le r < q$. Establish the inequalities

$$\ell \le \#\{n : 1 \le n \le N,\ \langle n\theta \rangle = a/q\} \le \ell + 1.]$$
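The counting inequalities in the hint can be checked experimentally with exact integer arithmetic. The sketch below is illustrative only: the function `count_hits` and the sample values of `p`, `q`, `N` are assumptions made here, not part of the exercise. It uses the fact that $\langle np/q \rangle = a/q$ exactly when $np \equiv a \pmod q$.

```python
def count_hits(p, q, a, N):
    """Number of n in 1..N whose fractional part of n*p/q equals a/q."""
    # <n p / q> = a / q exactly when n*p is congruent to a modulo q.
    return sum(1 for n in range(1, N + 1) if (n * p) % q == a)


p, q, N = 3, 7, 1000          # sample values with gcd(p, q) = 1
ell, r = divmod(N, q)         # N = ell*q + r with 0 <= r < q
counts = [count_hits(p, q, a, N) for a in range(q)]
print(ell, counts)
```

Each count lies between $\ell$ and $\ell + 1$, so dividing by $N$ gives $1/q + O(1/N)$.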

7. Prove the second part of Weyl's criterion: if a sequence of numbers $\xi_1, \xi_2, \ldots$
in $[0, 1)$ is equidistributed, then for all $k \in \mathbb{Z} - \{0\}$

$$\frac{1}{N} \sum_{n=1}^{N} e^{2\pi i k \xi_n} \to 0 \quad \text{as } N \to \infty.$$

[Hint: It suffices to show that $\frac{1}{N} \sum_{n=1}^{N} f(\xi_n) \to \int_0^1 f(x)\,dx$ for all continuous $f$.
Prove this first when $f$ is the characteristic function of an interval.]

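8. Show that for any $a \ne 0$, and $\sigma$ with $0 < \sigma < 1$, the sequence $\langle a n^{\sigma} \rangle$ is
equidistributed in $[0, 1)$.

[Hint: Prove that $\sum_{n=1}^{N} e^{2\pi i b n^{\sigma}} = O(N^{\sigma}) + O(N^{1-\sigma})$ if $b \ne 0$.] In fact, note the
following:

$$\sum_{n=1}^{N} e^{2\pi i b n^{\sigma}} - \int_1^N e^{2\pi i b x^{\sigma}}\,dx = O\!\left(\sum_{n=1}^{N} n^{-1+\sigma}\right).$$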



9. In contrast with the result in Exercise 8, prove that $\langle a \log n \rangle$ is not
equidistributed for any $a$.

[Hint: Compare the sum $\sum_{n=1}^{N} e^{2\pi i b \log n}$ with the corresponding integral.]

10. Suppose that $f$ is a periodic function on $\mathbb{R}$ of period 1, and $\{\xi_n\}$ is a sequence
which is equidistributed in $[0, 1)$. Prove that:


¹ The elementary results in arithmetic used in this exercise can be found at the beginning of Chapter 8.












Ibookroot October 20, 2007 



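(a) If $f$ is continuous and satisfies $\int_0^1 f(x)\,dx = 0$, then

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x + \xi_n) = 0 \quad \text{uniformly in } x.$$

[Hint: Establish this result first for trigonometric polynomials.]

(b) If $f$ is merely integrable on $[0, 1]$ and satisfies $\int_0^1 f(x)\,dx = 0$, then

$$\lim_{N \to \infty} \int_0^1 \left| \frac{1}{N} \sum_{n=1}^{N} f(x + \xi_n) \right|^2 dx = 0.$$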


11. Show that if $u(x, t) = (f * H_t)(x)$ where $H_t$ is the heat kernel, and $f$ is
Riemann integrable, then

$$\int_0^1 |u(x, t) - f(x)|^2\,dx \to 0 \quad \text{as } t \to 0.$$



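12. A change of variables in (8) leads to the solution

$$u(\theta, \tau) = \sum a_n e^{-n^2 \tau} e^{in\theta} = (f * h_\tau)(\theta)$$

of the equation

$$\frac{\partial u}{\partial \tau} = \frac{\partial^2 u}{\partial \theta^2} \quad \text{with } 0 \le \theta \le 2\pi \text{ and } \tau > 0,$$

with boundary condition $u(\theta, 0) = f(\theta) \sim \sum a_n e^{in\theta}$. Here
$h_\tau(\theta) = \sum_{n=-\infty}^{\infty} e^{-n^2 \tau} e^{in\theta}$. This version of the heat kernel on $[0, 2\pi]$ is the analogue
of the Poisson kernel, which can be written as $P_r(\theta) = \sum_{n=-\infty}^{\infty} e^{-|n|\tau} e^{in\theta}$ with
$r = e^{-\tau}$ (and so $0 < r < 1$ corresponds to $\tau > 0$).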

13. The fact that the kernel $H_t(x)$ is a good kernel, hence $u(x, t) \to f(x)$ at
each point of continuity of $f$, is not easy to prove. This will be shown in the
next chapter. However, one can prove directly that $H_t(x)$ is "peaked" at $x = 0$
as $t \to 0$ in the following sense:

(a) Show that $\int_{-1/2}^{1/2} |H_t(x)|^2\,dx$ is of the order of magnitude of $t^{-1/2}$ as $t \to 0$.
More precisely, prove that $t^{1/2} \int_{-1/2}^{1/2} |H_t(x)|^2\,dx$ converges to a non-zero
limit as $t \to 0$.

(b) Prove that $\int_{-1/2}^{1/2} x^2 |H_t(x)|^2\,dx = O(t^{1/2})$ as $t \to 0$.

[Hint: For (a) compare the sum $\sum_{n=-\infty}^{\infty} e^{-cn^2 t}$ with the integral $\int_{-\infty}^{\infty} e^{-cx^2 t}\,dx$
where $c > 0$. For (b) use $x^2 \le C(\sin \pi x)^2$ for $-1/2 \le x \le 1/2$, and apply the
mean value theorem to $e^{-cx^2 t}$.]


6 Problems 

1.* This problem explores another relationship between the geometry of a curve
and Fourier series. The diameter of a closed curve $\Gamma$ parametrized by
$\gamma(t) = (x(t), y(t))$ on $[-\pi, \pi]$ is defined by

$$d = \sup_{P, Q \in \Gamma} |P - Q| = \sup_{t_1, t_2 \in [-\pi, \pi]} |\gamma(t_1) - \gamma(t_2)|.$$

If $a_n$ is the $n$th Fourier coefficient of $\gamma(t) = x(t) + iy(t)$ and $\ell$ denotes the length
of $\Gamma$, then

(a) $2|a_n| \le d$ for all $n \ne 0$.

(b) $\ell \le \pi d$, whenever $\Gamma$ is convex.

Property (a) follows from the fact that $2a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} [\gamma(t) - \gamma(t + \pi/n)] e^{-int}\,dt$.

The equality $\ell = \pi d$ is satisfied when $\Gamma$ is a circle, but surprisingly, this is
not the only case. In fact, one finds that $\ell = \pi d$ is equivalent to $2|a_1| = d$. We
re-parametrize $\gamma$ so that for each $t$ in $[-\pi, \pi]$ the tangent to the curve makes an
angle $t$ with the $y$-axis. Then, if $a_1 = 1$ we have

$$\gamma'(t) = ie^{it}(1 + r(t)),$$

where $r$ is a real-valued function which satisfies $r(t) + r(t + \pi) = 0$, and
$|r(t)| \le 1$. Figure 7 (a) shows the curve obtained by setting $r(t) = \cos 5t$. Also,
Figure 7 (b) consists of the curve where $r(t) = h(3t)$, with $h(s) = -1$ if $-\pi \le s \le 0$
and $h(s) = 1$ if $0 < s < \pi$. This curve (which is only piecewise of class $C^1$)
is known as the Reuleaux triangle and is the classical example of a convex curve
of constant width which is not a circle.

2.* Here we present an estimate of Weyl which leads to some interesting results.

(a) Let $S_N = \sum_{n=1}^{N} e^{2\pi i f(n)}$. Show that for $H \le N$, one has

$$|S_N|^2 \le c\,\frac{N}{H} \sum_{h=0}^{H} \left| \sum_{n=1}^{N-h} e^{2\pi i (f(n+h) - f(n))} \right|$$

for some constant $c > 0$ independent of $N$, $H$, and $f$.

(b) Use this estimate to show that the sequence $\langle n^2 \gamma \rangle$ is equidistributed in
$[0, 1)$ whenever $\gamma$ is irrational.

Figure 7. Some curves with maximal length for a given diameter


(c) More generally, show that if $\{\xi_n\}$ is a sequence of real numbers so that
for all positive integers $h$ the difference $\langle \xi_{n+h} - \xi_n \rangle$ is equidistributed in
$[0, 1)$, then $\langle \xi_n \rangle$ is also equidistributed in $[0, 1)$.

(d) Suppose that $P(x) = c_n x^n + \cdots + c_0$ is a polynomial with real coefficients,
where at least one of $c_1, \ldots, c_n$ is irrational. Then the sequence $\langle P(n) \rangle$ is
equidistributed in $[0, 1)$.

[Hint: For (a), let $a_n = e^{2\pi i f(n)}$ when $1 \le n \le N$ and 0 otherwise. Then write
$H \sum_n a_n = \sum_{k=1}^{H} \sum_n a_{n+k}$ and apply the Cauchy-Schwarz inequality. For (b),
note that $(n + h)^2 \gamma - n^2 \gamma = 2nh\gamma + h^2 \gamma$, and use the fact that for each integer
$h$, the sequence $\langle 2nh\gamma \rangle$ is equidistributed. Finally, to prove (d), assume first that
$P(x) = Q(x) + c_1 x + c_0$ where $c_1$ is irrational, and estimate the exponential sum
$\sum_{n=1}^{N} e^{2\pi i k P(n)}$. Then, argue by induction on the highest degree term which has
an irrational coefficient, and use part (c).]

3.* If $\sigma > 0$ is not an integer and $a \ne 0$, then $\langle a n^{\sigma} \rangle$ is equidistributed in $[0, 1)$.
See also Exercise 8.


4. An elementary construction of a continuous but nowhere differentiable function
is obtained by "piling up singularities," as follows.

On $[-1, 1]$ consider the function

$$\varphi(x) = |x|$$

and extend $\varphi$ to $\mathbb{R}$ by requiring it to be periodic of period 2. Clearly, $\varphi$ is
continuous on $\mathbb{R}$ and $|\varphi(x)| \le 1$ for all $x$ so the function $f$ defined by

$$f(x) = \sum_{n=0}^{\infty} \left(\frac{3}{4}\right)^n \varphi(4^n x)$$

is continuous on $\mathbb{R}$.

(a) Fix $x_0 \in \mathbb{R}$. For every positive integer $m$, let $\delta_m = \pm \frac{1}{2} 4^{-m}$ where the
sign is chosen so that no integer lies in between $4^m x_0$ and $4^m (x_0 + \delta_m)$.
Consider the quotient

$$\gamma_n = \frac{\varphi(4^n (x_0 + \delta_m)) - \varphi(4^n x_0)}{\delta_m}.$$

Prove that if $n > m$, then $\gamma_n = 0$, and for $0 \le n \le m$ one has $|\gamma_n| \le 4^n$
with $|\gamma_m| = 4^m$.

(b) From the above observations prove the estimate

$$\left| \frac{f(x_0 + \delta_m) - f(x_0)}{\delta_m} \right| \ge \frac{1}{2}(3^m + 1),$$

and conclude that $f$ is not differentiable at $x_0$.
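The estimate in part (b) can be explored with exact rational arithmetic. The following Python sketch is only an illustration under stated assumptions: the helper names (`phi`, `choose_delta`, `gamma`, `difference_quotient`) and the sample point `x0 = 3/10` are chosen here and are not part of the problem. It relies on part (a): the terms with $n > m$ vanish, so the difference quotient of $f$ is a finite sum.

```python
from fractions import Fraction
import math


def phi(x):
    """phi(x) = |x| on [-1, 1], extended to be periodic of period 2."""
    r = x % 2                      # representative in [0, 2)
    return r if r <= 1 else 2 - r


def choose_delta(x0, m):
    """delta_m = +-(1/2) 4^{-m}, signed so that no integer lies strictly
    between 4^m x0 and 4^m (x0 + delta_m)."""
    y = 4 ** m * x0
    half = Fraction(1, 2)
    # The smallest integer greater than y is floor(y) + 1.
    plus_ok = not (math.floor(y) + 1 < y + half)
    return (half if plus_ok else -half) / 4 ** m


def gamma(n, x0, d):
    return (phi(4 ** n * (x0 + d)) - phi(4 ** n * x0)) / d


def difference_quotient(x0, m):
    d = choose_delta(x0, m)
    # Terms with n > m vanish, so the sum over 0 <= n <= m is the full quotient.
    return sum(Fraction(3, 4) ** n * gamma(n, x0, d) for n in range(m + 1))


x0 = Fraction(3, 10)               # sample point; any rational would do
quotients = [difference_quotient(x0, m) for m in range(1, 9)]
print([float(abs(qm)) for qm in quotients])
```

The absolute values grow at least like $\frac{1}{2}(3^m + 1)$ while $\delta_m \to 0$, which is incompatible with differentiability at $x_0$.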


5. Let $f$ be a Riemann integrable function on the interval $[-\pi, \pi]$. We define
the generalized delayed means of the Fourier series of $f$ by

$$\sigma_{N,K} = \frac{S_N + \cdots + S_{N+K-1}}{K}.$$

Note that in particular

$$\sigma_{0,N} = \sigma_N, \quad \sigma_{N,1} = S_N \quad \text{and} \quad \sigma_{N,N} = \Delta_N,$$

where $\Delta_N$ are the specific delayed means used in Section 3.

(a) Show that

$$\sigma_{N,K} = \frac{1}{K} \big( (N + K)\sigma_{N+K} - N\sigma_N \big),$$

and

$$\sigma_{N,K} = S_N + \sum_{N+1 \le |\nu| \le N+K-1} \left( 1 - \frac{|\nu| - N}{K} \right) \hat{f}(\nu) e^{i\nu\theta}.$$

From this last expression for $\sigma_{N,K}$ conclude that

$$|\sigma_{N,K} - S_M| \le \sum_{N+1 \le |\nu| \le N+K-1} |\hat{f}(\nu)|$$

for all $N \le M < N + K$.

(b) Use one of the above formulas and Fejér's theorem to show that with
$N = kn$ and $K = n$, then

$$\sigma_{kn,n}(f)(\theta) \to f(\theta) \quad \text{as } n \to \infty$$

whenever $f$ is continuous at $\theta$, and also

$$\sigma_{kn,n}(f)(\theta) \to \frac{f(\theta^+) + f(\theta^-)}{2} \quad \text{as } n \to \infty$$

at a jump discontinuity (refer to the preceding chapters and their exercises
for the appropriate definitions and results). In the case when $f$ is
continuous on $[-\pi, \pi]$, show that $\sigma_{kn,n}(f) \to f$ uniformly as $n \to \infty$.

(c) Using part (a), show that if $\hat{f}(\nu) = O(1/|\nu|)$ and $kn \le m < (k + 1)n$, we
get

$$|\sigma_{kn,n} - S_m| \le \frac{C}{k} \quad \text{for some constant } C > 0.$$

(d) Suppose that $\hat{f}(\nu) = O(1/|\nu|)$. Prove that if $f$ is continuous at $\theta$ then

$$S_N(f)(\theta) \to f(\theta) \quad \text{as } N \to \infty,$$

and if $f$ has a jump discontinuity at $\theta$ then

$$S_N(f)(\theta) \to \frac{f(\theta^+) + f(\theta^-)}{2} \quad \text{as } N \to \infty.$$

Also, show that if $f$ is continuous on $[-\pi, \pi]$, then $S_N(f) \to f$ uniformly.

(e) The above arguments show that if $\sum c_n$ is Cesàro summable to $s$ and $c_n = O(1/n)$,
then $\sum c_n$ converges to $s$. This is a weak version of Littlewood's
theorem (Problem 3, Chapter 2).


6. Dirichlet's theorem states that the Fourier series of a real continuous periodic
function $f$ which has only a finite number of relative maxima and minima
converges everywhere to $f$ (and uniformly).

Prove this theorem by showing that such a function satisfies $\hat{f}(n) = O(1/|n|)$.
[Hint: Argue as in Exercise 17, Chapter 3; then use conclusion (d) in Problem 5
above.]

The Fourier Transform on $\mathbb{R}$


The theory of Fourier series and integrals has always
had major difficulties and necessitated a large mathematical
apparatus in dealing with questions of convergence.
It engendered the development of methods
of summation, although these did not lead to a completely
satisfactory solution of the problem. … For the
Fourier transform, the introduction of distributions
(hence the space $\mathcal{S}$) is inevitable either in an explicit
or hidden form. … As a result one may obtain all that
is desired from the point of view of the continuity and
inversion of the Fourier transform.


L. Schwartz, 1950 


The theory of Fourier series applies to functions on the circle, or equivalently,
periodic functions on $\mathbb{R}$. In this chapter, we develop an analogous
theory for functions on the entire real line which are non-periodic. The
functions we consider will be suitably "small" at infinity. There are several
ways of defining an appropriate notion of "smallness," but it will
nevertheless be vital to assume some sort of vanishing at infinity.

On the one hand, recall that the Fourier series of a periodic function
associates a sequence of numbers, namely the Fourier coefficients, to
that function; on the other hand, given a suitable function $f$ on $\mathbb{R}$, the
analogous object associated to $f$ will in fact be another function $\hat{f}$ on $\mathbb{R}$
which is called the Fourier transform of $f$. Since the Fourier transform
of a function on $\mathbb{R}$ is again a function on $\mathbb{R}$, one can observe a symmetry
between a function and its Fourier transform, whose analogue is not as
apparent in the setting of Fourier series.


Roughly speaking, the Fourier transform is a continuous version of the
Fourier coefficients. Recall that the Fourier coefficients $a_n$ of a function
$f$ defined on the circle are given by

(1)  $$a_n = \int_0^1 f(x) e^{-2\pi i n x}\,dx,$$

and then in the appropriate sense we have

(2)  $$f(x) = \sum_{n=-\infty}^{\infty} a_n e^{2\pi i n x}.$$

Here we have replaced $\theta$ by $2\pi x$, as we have frequently done previously.

Now, consider the following analogy where we replace all of the discrete
symbols (such as integers and sums) by their continuous counterparts
(such as real numbers and integrals). In other words, given a function $f$
on all of $\mathbb{R}$, we define its Fourier transform by changing the domain of
integration from the circle to all of $\mathbb{R}$, and by replacing $n \in \mathbb{Z}$ by $\xi \in \mathbb{R}$
in (1), that is, by setting

(3)  $$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi}\,dx.$$

We push our analogy further, and consider the following continuous version
of (2): replacing the sum by an integral, and $a_n$ by $\hat{f}(\xi)$, leads to
the Fourier inversion formula,

(4)  $$f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i x \xi}\,d\xi.$$

Under suitable hypotheses on $f$, the identity (4) actually holds, and
much of the theory in this chapter aims at proving and exploiting this
relation. The validity of the Fourier inversion formula is also suggested
by the following simple observation. Suppose $f$ is supported in a finite
interval contained in $I = [-L/2, L/2]$, and we expand $f$ in a Fourier series
on $I$. Then, letting $L$ tend to infinity, we are led to (4) (see Exercise 1).

The special properties of the Fourier transform make it an important 
tool in the study of partial differential equations. For instance, we shall 
see how the Fourier inversion formula allows us to analyze some equations 
that are modeled on the real line. In particular, following the ideas 
developed on the circle, we solve the time-dependent heat equation for 
an infinite rod and the steady-state heat equation in the upper half-plane. 

In the last part of the chapter we discuss further topics related to the
Poisson summation formula,

$$\sum_{n \in \mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} \hat{f}(n),$$

which gives another remarkable connection between periodic functions
(and their Fourier series) and non-periodic functions on the line (and
their Fourier transforms). This identity allows us to prove an assertion
made in the previous chapter, namely, that the heat kernel $H_t(x)$ satisfies
the properties of a good kernel. In addition, the Poisson summation
formula arises in many other settings, in particular in parts of number
theory, as we shall see in Book II.

We make a final comment about the approach we have chosen. In our 
study of Fourier series, we found it useful to consider Riemann integrable 
functions on the circle. In particular, this generality assured us that even 
functions that had certain discontinuities could be treated by the theory. 
In contrast, our exposition of the elementary properties of the Fourier 
transform is stated in terms of the Schwartz space S of testing functions. 
These are functions that are indefinitely differentiable and that, together 
with their derivatives, are rapidly decreasing at infinity. The reliance on 
this space of functions is a device that allows us to come quickly to the 
main conclusions, formulated in a direct and transparent fashion. Once 
this is carried out, we point out some easy extensions to a somewhat 
wider setting. The more general theory of Fourier transforms (which 
must necessarily be based on Lebesgue integration) will be treated in 
Book III. 

1 Elementary theory of the Fourier transform 

We begin by extending the notion of integration to functions that are 
defined on the whole real line. 

1.1 Integration of functions on the real line 

Given the notion of the integral of a function on a closed and bounded
interval, the most natural extension of this definition to continuous functions
over $\mathbb{R}$ is

$$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{N \to \infty} \int_{-N}^{N} f(x)\,dx.$$

Of course, this limit may not exist. For example, it is clear that if
$f(x) = 1$, or even if $f(x) = 1/(1 + |x|)$, then the above limit is infinite.
A moment's reflection suggests that the limit will exist if we impose on
$f$ enough decay as $|x|$ tends to infinity. A useful condition is as follows.

A function $f$ defined on $\mathbb{R}$ is said to be of moderate decrease if $f$
is continuous and there exists a constant $A > 0$ so that

$$|f(x)| \le \frac{A}{1 + x^2} \quad \text{for all } x \in \mathbb{R}.$$

This inequality says that $f$ is bounded (by $A$ for instance), and also that
it decays at infinity at least as fast as $1/x^2$, since $A/(1 + x^2) \le A/x^2$.

For example, the function $f(x) = 1/(1 + |x|^n)$ is of moderate decrease
as long as $n \ge 2$. Another example is given by the function $e^{-a|x|}$ for
$a > 0$.

We shall denote by $\mathcal{M}(\mathbb{R})$ the set of functions of moderate decrease
on $\mathbb{R}$. As an exercise, the reader can check that under the usual addition
of functions and multiplication by scalars, $\mathcal{M}(\mathbb{R})$ forms a vector space
over $\mathbb{C}$.

We next see that whenever $f$ belongs to $\mathcal{M}(\mathbb{R})$, then we may define

(5)  $$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{N \to \infty} \int_{-N}^{N} f(x)\,dx,$$

where the limit now exists. Indeed, for each $N$ the integral $I_N = \int_{-N}^{N} f(x)\,dx$
is well defined because $f$ is continuous. It now suffices to
show that $\{I_N\}$ is a Cauchy sequence, and this follows because if $M > N$,
then

$$|I_M - I_N| \le \left| \int_{N \le |x| \le M} f(x)\,dx \right| \le A \int_{N \le |x| \le M} \frac{dx}{x^2} \le \frac{2A}{N} \to 0 \quad \text{as } N \to \infty.$$

Notice we have also proved that $\int_{|x| \ge N} f(x)\,dx \to 0$ as $N \to \infty$. At this
point, we remark that we may replace the exponent 2 in the definition
of moderate decrease by $1 + \epsilon$ where $\epsilon > 0$; that is,

$$|f(x)| \le \frac{A}{1 + |x|^{1+\epsilon}} \quad \text{for all } x \in \mathbb{R}.$$

This definition would work just as well for the purpose of the theory
developed in this chapter. We chose $\epsilon = 1$ merely as a matter of convenience.
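The Cauchy estimate $|I_M - I_N| \le 2A/N$ can be checked on a concrete example. The sketch below is illustrative only: it assumes the sample function $f(x) = 1/(1 + x^2)$ (so $A = 1$), for which $I_N = 2\arctan N$ in closed form, and the helper name is chosen here.

```python
import math


def integral_over_symmetric_interval(N):
    """I_N = integral of 1/(1 + x^2) over [-N, N], in closed form."""
    return 2.0 * math.atan(N)


A = 1.0  # f(x) = 1/(1 + x^2) satisfies |f(x)| <= A/(1 + x^2) with A = 1

pairs = [(1, 2), (10, 1000), (100, 10 ** 6)]
gaps = []
for N, M in pairs:
    gap = abs(integral_over_symmetric_interval(M) - integral_over_symmetric_interval(N))
    gaps.append((gap, 2 * A / N))
    print(N, M, gap, 2 * A / N)
```

In this example the limit of $I_N$ is $\pi$, and each gap stays below the bound $2A/N$.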

We summarize some elementary properties of integration over R in a 
proposition. 

Proposition 1.1 The integral of a function of moderate decrease defined
by (5) satisfies the following properties:

(i) Linearity: if $f, g \in \mathcal{M}(\mathbb{R})$ and $a, b \in \mathbb{C}$, then

$$\int_{-\infty}^{\infty} (af(x) + bg(x))\,dx = a \int_{-\infty}^{\infty} f(x)\,dx + b \int_{-\infty}^{\infty} g(x)\,dx.$$

(ii) Translation invariance: for every $h \in \mathbb{R}$ we have

$$\int_{-\infty}^{\infty} f(x - h)\,dx = \int_{-\infty}^{\infty} f(x)\,dx.$$

(iii) Scaling under dilations: if $\delta > 0$, then

$$\delta \int_{-\infty}^{\infty} f(\delta x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx.$$

(iv) Continuity: if $f \in \mathcal{M}(\mathbb{R})$, then

$$\int_{-\infty}^{\infty} |f(x - h) - f(x)|\,dx \to 0 \quad \text{as } h \to 0.$$

We say a few words about the proof. Property (i) is immediate. To
verify property (ii), it suffices to see that

$$\int_{-N}^{N} f(x - h)\,dx - \int_{-N}^{N} f(x)\,dx \to 0 \quad \text{as } N \to \infty.$$

Since $\int_{-N}^{N} f(x - h)\,dx = \int_{-N-h}^{N-h} f(x)\,dx$, the above difference is majorized
by

$$\left| \int_{-N-h}^{-N} f(x)\,dx \right| + \left| \int_{N-h}^{N} f(x)\,dx \right| \le \frac{A'}{1 + N^2}$$

for large $N$, which tends to 0 as $N$ tends to infinity.

The proof of property (iii) is similar once we observe that
$\delta \int_{-N}^{N} f(\delta x)\,dx = \int_{-\delta N}^{\delta N} f(x)\,dx$.

To prove property (iv) it suffices to take $|h| \le 1$. For a preassigned $\epsilon > 0$,
we first choose $N$ so large that

$$\int_{|x| \ge N} |f(x)|\,dx \le \epsilon/4 \quad \text{and} \quad \int_{|x| \ge N} |f(x - h)|\,dx \le \epsilon/4.$$

Now with $N$ fixed, we use the fact that since $f$ is continuous, it is uniformly
continuous in the interval $[-N - 1, N + 1]$. Hence
$\sup_{|x| \le N} |f(x - h) - f(x)| \to 0$ as $h$ tends to 0. So we can take $h$ so
small that this supremum is less than $\epsilon/4N$. Altogether, then,

$$\int_{-\infty}^{\infty} |f(x - h) - f(x)|\,dx \le \int_{-N}^{N} |f(x - h) - f(x)|\,dx + \int_{|x| \ge N} |f(x - h)|\,dx + \int_{|x| \ge N} |f(x)|\,dx$$
$$\le \epsilon/2 + \epsilon/4 + \epsilon/4 = \epsilon,$$

and thus conclusion (iv) follows.

1.2 Definition of the Fourier transform 

If $f \in \mathcal{M}(\mathbb{R})$, we define its Fourier transform for $\xi \in \mathbb{R}$ by

$$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi}\,dx.$$

Of course, $|e^{-2\pi i x \xi}| = 1$, so the integrand is of moderate decrease, and
the integral makes sense.

In fact, this last observation implies that $\hat{f}$ is bounded, and moreover,
a simple argument shows that $\hat{f}$ is continuous and tends to 0 as $|\xi| \to \infty$
(Exercise 5). However, nothing in the definition above guarantees that
$\hat{f}$ is of moderate decrease, or has a specific decay. In particular, it is not
clear in this context how to make sense of the integral $\int_{-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i x \xi}\,d\xi$
and the resulting Fourier inversion formula. To remedy this, we introduce
a more refined space of functions considered by Schwartz which is very
useful in establishing the initial properties of the Fourier transform.

The choice of the Schwartz space is motivated by an important principle
which ties the decay of $\hat{f}$ to the continuity and differentiability
properties of $f$ (and vice versa): the faster $\hat{f}(\xi)$ decreases as $|\xi| \to \infty$,
the "smoother" $f$ must be. An example that reflects this principle is
given in Exercise 3. We also note that this relationship between $f$ and $\hat{f}$
is reminiscent of a similar one between the smoothness of a function on
the circle and the decay of its Fourier coefficients; see the discussion of
Corollary 2.4 in Chapter 2.


1.3 The Schwartz space 

The Schwartz space on $\mathbb{R}$ consists of the set of all indefinitely differentiable
functions $f$ so that $f$ and all its derivatives $f', f'', \ldots, f^{(\ell)}, \ldots,$
are rapidly decreasing, in the sense that

$$\sup_{x \in \mathbb{R}} |x|^k |f^{(\ell)}(x)| < \infty \quad \text{for every } k, \ell \ge 0.$$

We denote this space by $\mathcal{S} = \mathcal{S}(\mathbb{R})$, and again, the reader should verify
that $\mathcal{S}(\mathbb{R})$ is a vector space over $\mathbb{C}$. Moreover, if $f \in \mathcal{S}(\mathbb{R})$, we have

$$f'(x) = \frac{df}{dx} \in \mathcal{S}(\mathbb{R}) \quad \text{and} \quad x f(x) \in \mathcal{S}(\mathbb{R}).$$

This expresses the important fact that the Schwartz space is closed under
differentiation and multiplication by polynomials.

A simple example of a function in $\mathcal{S}(\mathbb{R})$ is the Gaussian defined by

$$f(x) = e^{-x^2},$$

which plays a central role in the theory of the Fourier transform, as well
as other fields (for example, probability theory and physics). The reader
can check that the derivatives of $f$ are of the form $P(x) e^{-x^2}$ where $P$ is
a polynomial, and this immediately shows that $f \in \mathcal{S}(\mathbb{R})$. In fact, $e^{-ax^2}$
belongs to $\mathcal{S}(\mathbb{R})$ whenever $a > 0$. Later, we will normalize the Gaussian
by choosing $a = \pi$.

An important class of other examples in $\mathcal{S}(\mathbb{R})$ are the "bump functions"
which vanish outside bounded intervals (Exercise 4).

As a final remark, note that although $e^{-|x|}$ decreases rapidly at infinity,
it is not differentiable at 0 and therefore does not belong to $\mathcal{S}(\mathbb{R})$.

1.4 The Fourier transform on $\mathcal{S}$

The Fourier transform of a function $f \in \mathcal{S}(\mathbb{R})$ is defined by

$$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi}\,dx.$$

Some simple properties of the Fourier transform are gathered in the following
proposition. We use the notation

$$f(x) \longrightarrow \hat{f}(\xi)$$

to mean that $\hat{f}$ denotes the Fourier transform of $f$.

Proposition 1.2 If $f \in \mathcal{S}(\mathbb{R})$ then:

(i) $f(x + h) \longrightarrow \hat{f}(\xi) e^{2\pi i h \xi}$ whenever $h \in \mathbb{R}$.

(ii) $f(x) e^{-2\pi i x h} \longrightarrow \hat{f}(\xi + h)$ whenever $h \in \mathbb{R}$.

(iii) $f(\delta x) \longrightarrow \delta^{-1} \hat{f}(\delta^{-1} \xi)$ whenever $\delta > 0$.

(iv) $f'(x) \longrightarrow 2\pi i \xi \hat{f}(\xi)$.

(v) $-2\pi i x f(x) \longrightarrow \frac{d}{d\xi} \hat{f}(\xi)$.

In particular, except for factors of $2\pi i$, the Fourier transform interchanges
differentiation and multiplication by $x$. This is the key property
that makes the Fourier transform a central object in the theory of differential
equations. We shall return to this point later.

Proof. Property (i) is an immediate consequence of the translation
invariance of the integral, and property (ii) follows from the definition.
Also, the third property of Proposition 1.1 establishes (iii).

Integrating by parts gives

$$\int_{-N}^{N} f'(x) e^{-2\pi i x \xi}\,dx = \left[ f(x) e^{-2\pi i x \xi} \right]_{-N}^{N} + 2\pi i \xi \int_{-N}^{N} f(x) e^{-2\pi i x \xi}\,dx,$$

so letting $N$ tend to infinity gives (iv).

Finally, to prove property (v), we must show that $\hat{f}$ is differentiable
and find its derivative. Let $\epsilon > 0$ and consider

$$\frac{\hat{f}(\xi + h) - \hat{f}(\xi)}{h} - \widehat{(-2\pi i x f)}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi} \left[ \frac{e^{-2\pi i x h} - 1}{h} + 2\pi i x \right] dx.$$

Since $f(x)$ and $x f(x)$ are of rapid decrease, there exists an integer $N$
so that $\int_{|x| \ge N} |f(x)|\,dx \le \epsilon$ and $\int_{|x| \ge N} |x|\,|f(x)|\,dx \le \epsilon$. Moreover, for
$|x| \le N$, there exists $h_0$ so that $|h| < h_0$ implies

$$\left| \frac{e^{-2\pi i x h} - 1}{h} + 2\pi i x \right| \le \frac{\epsilon}{N}.$$

Hence for $|h| < h_0$ we have

$$\left| \frac{\hat{f}(\xi + h) - \hat{f}(\xi)}{h} - \widehat{(-2\pi i x f)}(\xi) \right| \le \int_{-N}^{N} \left| f(x) e^{-2\pi i x \xi} \left[ \frac{e^{-2\pi i x h} - 1}{h} + 2\pi i x \right] \right| dx + C\epsilon \le C'\epsilon.$$

Theorem 1.3 If $f \in \mathcal{S}(\mathbb{R})$, then $\hat{f} \in \mathcal{S}(\mathbb{R})$.

The proof is an easy application of the fact that the Fourier transform
interchanges differentiation and multiplication. In fact, note that if
$f \in \mathcal{S}(\mathbb{R})$, its Fourier transform $\hat{f}$ is bounded; then also, for each pair of
non-negative integers $\ell$ and $k$, the expression

$$\xi^k \left( \frac{d}{d\xi} \right)^{\ell} \hat{f}(\xi)$$

is bounded, since by the last proposition, it is the Fourier transform of

$$\frac{1}{(2\pi i)^k} \left( \frac{d}{dx} \right)^k \left[ (-2\pi i x)^{\ell} f(x) \right].$$

The proof of the inversion formula

$$f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i x \xi}\,d\xi \quad \text{for } f \in \mathcal{S}(\mathbb{R}),$$

which we give in the next section, is based on a careful study of the
function $e^{-ax^2}$, which, as we have already observed, is in $\mathcal{S}(\mathbb{R})$ if $a > 0$.

The Gaussians as good kernels

We begin by considering the case $a = \pi$ because of the normalization:

(6)  $$\int_{-\infty}^{\infty} e^{-\pi x^2}\,dx = 1.$$

To see why (6) is true, we use the multiplicative property of the exponential
to reduce the calculation to a two-dimensional integral. More
precisely, we can argue as follows:

$$\left( \int_{-\infty}^{\infty} e^{-\pi x^2}\,dx \right)^2 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\pi (x^2 + y^2)}\,dx\,dy = \int_0^{2\pi} \int_0^{\infty} e^{-\pi r^2} r\,dr\,d\theta = \int_0^{\infty} 2\pi r e^{-\pi r^2}\,dr = \left[ -e^{-\pi r^2} \right]_0^{\infty} = 1,$$

where we have evaluated the two-dimensional integral using polar coordinates.

The fundamental property of the Gaussian which is of interest to us,
and which actually follows from (6), is that $e^{-\pi x^2}$ equals its Fourier
transform! We isolate this important result in a theorem.

Theorem 1.4 If $f(x) = e^{-\pi x^2}$, then $\hat{f}(\xi) = f(\xi)$.

Proof. Define

$$F(\xi) = \hat{f}(\xi) = \int_{-\infty}^{\infty} e^{-\pi x^2} e^{-2\pi i x \xi}\,dx,$$

and observe that $F(0) = 1$, by our previous calculation. By property (v)
in Proposition 1.2, and the fact that $f'(x) = -2\pi x f(x)$, we obtain

$$F'(\xi) = \int_{-\infty}^{\infty} f(x)(-2\pi i x) e^{-2\pi i x \xi}\,dx = i \int_{-\infty}^{\infty} f'(x) e^{-2\pi i x \xi}\,dx.$$

By (iv) of the same proposition, we find that

$$F'(\xi) = i (2\pi i \xi) \hat{f}(\xi) = -2\pi \xi F(\xi).$$

If we define $G(\xi) = F(\xi) e^{\pi \xi^2}$, then from what we have seen above, it
follows that $G'(\xi) = 0$, hence $G$ is constant. Since $F(0) = 1$, we conclude
that $G$ is identically equal to 1, therefore $F(\xi) = e^{-\pi \xi^2}$, as was to be
shown.
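Theorem 1.4 can be sanity-checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it approximates the defining integral by the trapezoid rule on a truncated interval $[-L, L]$, and the helper `fourier_transform` with its parameters `L` and `steps` is introduced here for that purpose only.

```python
import cmath
import math


def fourier_transform(f, xi, L=8.0, steps=4000):
    """Approximate the integral of f(x) e^{-2 pi i x xi} over [-L, L]
    by the trapezoid rule (f is assumed negligible outside [-L, L])."""
    h = 2.0 * L / steps
    total = 0.0 + 0.0j
    for j in range(steps + 1):
        x = -L + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h


def gaussian(x):
    return math.exp(-math.pi * x * x)


# Compare the numerical transform with the Gaussian itself at a few frequencies.
errors = [abs(fourier_transform(gaussian, xi) - gaussian(xi))
          for xi in (0.0, 0.5, 1.0, 2.0)]
print(max(errors))
```

The value at $\xi = 0$ also reproduces the normalization (6).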

The scaling properties of the Fourier transform under dilations yield
the following important transformation law, which follows from (iii) in
Proposition 1.2 (with $\delta$ replaced by $\delta^{-1/2}$).

Corollary 1.5 If $\delta > 0$ and $K_\delta(x) = \delta^{-1/2} e^{-\pi x^2/\delta}$, then $\widehat{K_\delta}(\xi) = e^{-\pi \delta \xi^2}$.

We pause to make an important observation. As $\delta$ tends to 0, the
function $K_\delta$ peaks at the origin, while its Fourier transform $\widehat{K_\delta}$ gets
flatter. So in this particular example, we see that $K_\delta$ and $\widehat{K_\delta}$ cannot both
be localized (that is, concentrated) at the origin. This is an example of a
general phenomenon called the Heisenberg uncertainty principle, which
we will discuss at the end of this chapter.

We have now constructed a family of good kernels on the real line,
analogous to those on the circle considered in Chapter 2. Indeed, with

$$K_\delta(x) = \delta^{-1/2} e^{-\pi x^2/\delta},$$

we have:

(i) $\int_{-\infty}^{\infty} K_\delta(x)\,dx = 1$.

(ii) $\int_{-\infty}^{\infty} |K_\delta(x)|\,dx \le M$.

(iii) For every $\eta > 0$, we have $\int_{|x| > \eta} |K_\delta(x)|\,dx \to 0$ as $\delta \to 0$.

To prove (i), we may change variables and use (6), or note that the
integral equals $\widehat{K_\delta}(0)$, which is 1 by Corollary 1.5. Since $K_\delta \ge 0$, it is
clear that property (ii) is also true. Finally we can again change variables
to get

$$\int_{|x| > \eta} |K_\delta(x)|\,dx = \int_{|y| > \eta/\delta^{1/2}} e^{-\pi y^2}\,dy \to 0$$

as $\delta$ tends to 0. We have thus proved the following result.

Theorem 1.6 The collection $\{K_\delta\}_{\delta > 0}$ is a family of good kernels
as $\delta \to 0$.
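The three good-kernel properties can be observed numerically for the Gaussian family. In the sketch below (illustrative only; the function names, the cutoff `eta = 0.1`, and the sample values of $\delta$ are assumptions made here), the tail integral is evaluated through the complementary error function, since the change of variables $y = x/\delta^{1/2}$ gives $\int_{|x| > \eta} K_\delta = \operatorname{erfc}(\eta \sqrt{\pi/\delta})$.

```python
import math


def K(delta, x):
    return math.exp(-math.pi * x * x / delta) / math.sqrt(delta)


def tail_mass(delta, eta):
    """Integral of K_delta over |x| > eta, via y = x / sqrt(delta)."""
    return math.erfc(eta * math.sqrt(math.pi / delta))


def total_mass(delta, L=10.0, steps=20000):
    """Trapezoid approximation of the integral of K_delta over [-L, L]."""
    h = 2.0 * L / steps
    s = 0.5 * (K(delta, -L) + K(delta, L))
    s += sum(K(delta, -L + j * h) for j in range(1, steps))
    return s * h


eta = 0.1
deltas = [1.0, 0.1, 0.01, 0.001]
tails = [tail_mass(d, eta) for d in deltas]
masses = [total_mass(d) for d in deltas]
print(masses)
print(tails)
```

The total mass stays at 1 while the mass outside $|x| \le \eta$ collapses as $\delta$ decreases.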

We next apply these good kernels via the operation of convolution,
which is given as follows. If $f, g \in \mathcal{S}(\mathbb{R})$, their convolution is defined by

(7)  $$(f * g)(x) = \int_{-\infty}^{\infty} f(x - t) g(t)\,dt.$$

For a fixed value of $x$, the function $f(x - t) g(t)$ is of rapid decrease in
$t$, hence the integral converges.

By the argument in Section 4 of Chapter 2 (with a slight modification),
we get the following corollary.

Corollary 1.7 If $f \in \mathcal{S}(\mathbb{R})$, then

$$(f * K_\delta)(x) \to f(x) \quad \text{uniformly in } x \text{ as } \delta \to 0.$$

Proof. First, we claim that $f$ is uniformly continuous on $\mathbb{R}$. Indeed,
given $\epsilon > 0$ there exists $R > 0$ so that $|f(x)| < \epsilon/4$ whenever $|x| \ge R$.
Moreover, $f$ is continuous, hence uniformly continuous on the compact
interval $[-R, R]$, and together with the previous observation, we can find
$\eta > 0$ so that $|f(x) - f(y)| < \epsilon$ whenever $|x - y| < \eta$. Now we argue as
usual. Using the first property of good kernels, we can write

$$(f * K_\delta)(x) - f(x) = \int_{-\infty}^{\infty} K_\delta(t) \left[ f(x - t) - f(x) \right] dt,$$

and since $K_\delta \ge 0$, we find

$$|(f * K_\delta)(x) - f(x)| \le \int_{|t| > \eta} + \int_{|t| \le \eta} K_\delta(t)\,|f(x - t) - f(x)|\,dt.$$

The first integral is small by the third property of good kernels, and the
fact that $f$ is bounded, while the second integral is also small since $f$
is uniformly continuous and $\int K_\delta = 1$. This concludes the proof of the
corollary.
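The uniform convergence in Corollary 1.7 can be watched numerically. The sketch below is only an illustration: the test function $f(x) = x e^{-x^2}$, the sampling grid, and the trapezoid parameters `L` and `steps` are assumptions chosen here, and the maximum over a finite grid stands in for the supremum over $\mathbb{R}$.

```python
import math


def K(delta, t):
    return math.exp(-math.pi * t * t / delta) / math.sqrt(delta)


def f(x):
    # Hypothetical test function in the Schwartz space
    return x * math.exp(-x * x)


def convolve_with_kernel(x, delta, L=8.0, steps=3200):
    """Trapezoid approximation of (f * K_delta)(x) = int f(x - t) K_delta(t) dt."""
    h = 2.0 * L / steps
    total = 0.0
    for j in range(steps + 1):
        t = -L + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * f(x - t) * K(delta, t)
    return total * h


grid = [-3.0 + 0.25 * i for i in range(25)]
sup_errors = [max(abs(convolve_with_kernel(x, d) - f(x)) for x in grid)
              for d in (1.0, 0.1, 0.01)]
print(sup_errors)
```

The largest error over the grid shrinks as $\delta$ decreases, in line with the corollary.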

1.5 The Fourier inversion 

The next result is an identity sometimes called the multiplication formula.

Proposition 1.8 If $f, g \in \mathcal{S}(\mathbb{R})$, then

$$\int_{-\infty}^{\infty} f(x) \hat{g}(x)\,dx = \int_{-\infty}^{\infty} \hat{f}(y) g(y)\,dy.$$

To prove the proposition, we need to digress briefly to discuss the interchange
of the order of integration for double integrals. Suppose $F(x, y)$
is a continuous function in the plane $(x, y) \in \mathbb{R}^2$. We will assume the
following decay condition on $F$:

$$|F(x, y)| \le \frac{A}{(1 + x^2)(1 + y^2)}.$$

Then, we can state that for each $x$ the function $F(x, y)$ is of moderate
decrease in $y$, and similarly for each fixed $y$ the function $F(x, y)$ is of
moderate decrease in $x$. Moreover, the function $F_1(x) = \int_{-\infty}^{\infty} F(x, y)\,dy$
is continuous and of moderate decrease; similarly for the function
$F_2(y) = \int_{-\infty}^{\infty} F(x, y)\,dx$. Finally

$$\int_{-\infty}^{\infty} F_1(x)\,dx = \int_{-\infty}^{\infty} F_2(y)\,dy.$$

The proof of these facts may be found in the appendix.

We now apply this to $F(x, y) = f(x) g(y) e^{-2\pi i x y}$. Then $F_1(x) = f(x) \hat{g}(x)$,
and $F_2(y) = \hat{f}(y) g(y)$ so

$$\int_{-\infty}^{\infty} f(x) \hat{g}(x)\,dx = \int_{-\infty}^{\infty} \hat{f}(y) g(y)\,dy,$$

which is the assertion of the proposition.

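The multiplication formula and the fact that the Gaussian is its own
Fourier transform lead to a proof of the first major theorem.

Theorem 1.9 (Fourier inversion) If $f \in \mathcal{S}(\mathbb{R})$, then

$$f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i x \xi}\,d\xi.$$

Proof. We first claim that

$$f(0) = \int_{-\infty}^{\infty} \hat{f}(\xi)\,d\xi.$$

Let $G_\delta(x) = e^{-\pi \delta x^2}$ so that $\widehat{G_\delta}(\xi) = K_\delta(\xi)$. By the multiplication formula
we get

$$\int_{-\infty}^{\infty} f(x) K_\delta(x)\,dx = \int_{-\infty}^{\infty} \hat{f}(\xi) G_\delta(\xi)\,d\xi.$$

Since $K_\delta$ is a good kernel, the first integral goes to $f(0)$ as $\delta$ tends to 0.
Since the second integral clearly converges to $\int_{-\infty}^{\infty} \hat{f}(\xi)\,d\xi$ as $\delta$ tends to 0,
our claim is proved. In general, let $F(y) = f(y + x)$ so that

$$f(x) = F(0) = \int_{-\infty}^{\infty} \hat{F}(\xi)\,d\xi = \int_{-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i x \xi}\,d\xi.$$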
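The inversion formula can be tested numerically by transforming a function and then inverting the result. The sketch below is a rough illustration under stated assumptions: the sample function (a Gaussian centered at $1/2$), the truncation `L`, the number of `steps`, and the helper names are all chosen here, and both integrals are replaced by trapezoid sums on the same grid.

```python
import cmath
import math


def trapezoid(values, h):
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def f(x):
    # Hypothetical Schwartz function: a Gaussian centered at 1/2
    return math.exp(-math.pi * (x - 0.5) ** 2)


L, steps = 6.0, 600
h = 2.0 * L / steps
nodes = [-L + j * h for j in range(steps + 1)]

# Forward transform, sampled on the same grid of frequencies.
f_hat = [trapezoid([f(x) * cmath.exp(-2j * math.pi * x * xi) for x in nodes], h)
         for xi in nodes]


def inverse_at(x):
    """Approximate the integral of f_hat(xi) e^{2 pi i x xi} over [-L, L]."""
    return trapezoid([f_hat[k] * cmath.exp(2j * math.pi * x * nodes[k])
                      for k in range(len(nodes))], h)


recovered = {x: inverse_at(x) for x in (0.0, 0.5, 1.0)}
for x, value in recovered.items():
    print(x, f(x), value)
```

Up to discretization error, the recovered values agree with $f$, and their imaginary parts are negligible.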


As the name of Theorem 1.9 suggests, it provides a formula that inverts
the Fourier transform; in fact we see that the Fourier transform is its own
inverse except for the change of $x$ to $-x$. More precisely, we may define
two mappings $\mathcal{F} : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$ and $\mathcal{F}^* : \mathcal{S}(\mathbb{R}) \to \mathcal{S}(\mathbb{R})$ by

$$\mathcal{F}(f)(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi}\,dx \quad \text{and} \quad \mathcal{F}^*(g)(x) = \int_{-\infty}^{\infty} g(\xi) e^{2\pi i x \xi}\,d\xi.$$

Thus $\mathcal{F}$ is the Fourier transform, and Theorem 1.9 guarantees that
$\mathcal{F}^* \circ \mathcal{F} = I$ on $\mathcal{S}(\mathbb{R})$, where $I$ is the identity mapping. Moreover, since
the definitions of $\mathcal{F}$ and $\mathcal{F}^*$ differ only by a sign in the exponential, we
see that $\mathcal{F}(f)(y) = \mathcal{F}^*(f)(-y)$, so we also have $\mathcal{F} \circ \mathcal{F}^* = I$. As a consequence,
we conclude that $\mathcal{F}^*$ is the inverse of the Fourier transform on
$\mathcal{S}(\mathbb{R})$, and we get the following result.

Corollary 1.10 The Fourier transform is a bijective mapping on the
Schwartz space.


1.6 The Plancherel formula 

We need a few further results about convolutions of Schwartz functions. 
The key fact is that the Fourier transform interchanges convolutions with 
pointwise products, a result analogous to the situation for Fourier series. 

Proposition 1.11 If f^g G S(M) then: 

(i) f*9& 5(E). 

(ii) f * g = g* f. 

(iii) (f * 

Proof. To prove that $f * g$ is rapidly decreasing, observe first that for
any $\ell \ge 0$ we have $\sup_x |x|^\ell |g(x - y)| \le A_\ell (1 + |y|)^\ell$, because $g$ is rapidly
decreasing (to check this assertion, consider separately the two cases
$|x| \le 2|y|$ and $|x| \ge 2|y|$). From this, we see that

\[ \sup_x |x^\ell (f * g)(x)| \le A_\ell \int_{-\infty}^{\infty} |f(y)| (1 + |y|)^\ell\, dy, \]

so that $x^\ell (f * g)(x)$ is a bounded function for every $\ell \ge 0$. These
estimates carry over to the derivatives of $f * g$, thereby proving that
$f * g \in \mathcal{S}(\mathbb{R})$ because

\[ \left(\frac{d}{dx}\right)^k (f * g)(x) = \left(f * \left(\frac{d}{dx}\right)^k g\right)(x) \quad \text{for } k = 1, 2, \ldots. \]

This identity is proved first for $k = 1$ by differentiating under the
integral defining $f * g$. The interchange of differentiation and integration is
justified in this case by the rapid decrease of $dg/dx$. The identity then
follows for every $k$ by iteration.

For fixed $x$, the change of variables $x - y = u$ shows that

\[ (f * g)(x) = \int_{-\infty}^{\infty} f(x - u) g(u)\, du = (g * f)(x). \]

This change of variables is a composition of two changes, $y \mapsto -y$ and
$y \mapsto y - h$ (with $h = x$). For the first one we use the observation that
$\int_{-\infty}^{\infty} F(x)\, dx = \int_{-\infty}^{\infty} F(-x)\, dx$ for any Schwartz function $F$, and for the
second, we apply (ii) of Proposition 1.1.

Finally, consider $F(x, y) = f(y) g(x - y) e^{-2\pi i x \xi}$. Since $f$ and $g$ are
rapidly decreasing, considering separately the two cases $|x| \le 2|y|$ and
$|x| \ge 2|y|$, we see that the discussion of the change of order of integration
after Proposition 1.8 applies to $F$. In this case $F_1(x) = (f * g)(x) e^{-2\pi i x \xi}$,
and $F_2(y) = f(y) e^{-2\pi i y \xi} \hat{g}(\xi)$. Thus $\int_{-\infty}^{\infty} F_1(x)\, dx = \int_{-\infty}^{\infty} F_2(y)\, dy$, which
implies (iii). The proposition is therefore proved.
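
Part (iii) of Proposition 1.11 can be tested numerically. The sketch below is an illustration under stated assumptions: it takes $f = g = e^{-\pi x^2}$ (for which $\hat{f} = f$, as recalled earlier in the chapter), computes $f * g$ and the transforms by the midpoint rule, and compares $\widehat{f * g}(\xi)$ with $\hat{f}(\xi)\hat{g}(\xi)$. The cutoffs and step counts are arbitrary choices that suffice for Gaussians.

```python
import cmath
import math


def integrate(func, lo, hi, n):
    """Midpoint rule for a real- or complex-valued function on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def gauss(x):
    return math.exp(-math.pi * x * x)


def convolve(f, g, x, cutoff=6.0, n=400):
    """Approximate (f * g)(x), the integral of f(y) g(x - y) dy."""
    return integrate(lambda y: f(y) * g(x - y), -cutoff, cutoff, n)


def fourier(f, xi, cutoff=6.0, n=200):
    """Approximate the Fourier transform of f at xi."""
    return integrate(
        lambda x: f(x) * cmath.exp(-2j * math.pi * x * xi), -cutoff, cutoff, n
    )


def check_convolution_theorem(xi):
    """Return (transform of f*g, product of transforms) at frequency xi."""
    lhs = fourier(lambda x: convolve(gauss, gauss, x), xi)
    rhs = fourier(gauss, xi) * fourier(gauss, xi)
    return lhs, rhs
```

For this particular pair the convolution has the closed form $(f * g)(x) = 2^{-1/2} e^{-\pi x^2/2}$, so both sides should be close to $e^{-2\pi \xi^2}$.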

We now use the properties of convolutions of Schwartz functions to 
prove the main result of this section. The result we have in mind is the 
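analogue for functions on $\mathbb{R}$ of Parseval's identity for Fourier series.

The Schwartz space can be equipped with a Hermitian inner product

\[ (f, g) = \int_{-\infty}^{\infty} f(x)\, \overline{g(x)}\, dx \]

whose associated norm is

\[ \|f\| = \left( \int_{-\infty}^{\infty} |f(x)|^2\, dx \right)^{1/2}. \]

The second major theorem in the theory states that the Fourier transform
is a unitary transformation on $\mathcal{S}(\mathbb{R})$.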

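Theorem 1.12 (Plancherel) If $f \in \mathcal{S}(\mathbb{R})$ then $\|\hat{f}\| = \|f\|$.

Proof. If $f \in \mathcal{S}(\mathbb{R})$ define $f^\flat(x) = \overline{f(-x)}$. Then $\widehat{f^\flat}(\xi) = \overline{\hat{f}(\xi)}$. Now
let $h = f * f^\flat$. Clearly, we have

\[ \hat{h}(\xi) = |\hat{f}(\xi)|^2 \quad \text{and} \quad h(0) = \int_{-\infty}^{\infty} |f(x)|^2\, dx. \]

The theorem now follows from the inversion formula applied with $x = 0$,
that is,

\[ \int_{-\infty}^{\infty} \hat{h}(\xi)\, d\xi = h(0). \]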



1.7 Extension to functions of moderate decrease 

In the previous sections, we have limited our assertion of the Fourier 
inversion and Plancherel formulas to the case when the function involved 
belonged to the Schwartz space. It does not really involve further ideas to 
extend these results to functions of moderate decrease, once we make the 
additional assumption that the Fourier transform of the function under 
consideration is also of moderate decrease. Indeed, the key observation, 
which is easy to prove, is that the convolution $f * g$ of two functions $f$ and
$g$ of moderate decrease is again a function of moderate decrease
(Exercise 7); also $\widehat{f * g} = \hat{f}\, \hat{g}$. Moreover, the multiplication formula continues
to hold, and we deduce the Fourier inversion and Plancherel formulas
when $f$ and $\hat{f}$ are both of moderate decrease.

This generalization, although modest in scope, is nevertheless useful 
in some circumstances. 
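
To illustrate the Plancherel formula outside the Schwartz class, the sketch below uses $f(x) = 1/(1 + x^2)$, which is of moderate decrease but not rapidly decreasing. It assumes the closed form $\hat{f}(\xi) = \pi e^{-2\pi |\xi|}$, which is the Poisson kernel transform pair of Lemma 2.4 with $y = 1$; both squared norms should then equal $\pi/2$. The cutoffs and step counts are illustrative choices.

```python
import math


def integrate(func, lo, hi, n):
    """Midpoint rule on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def f(x):
    return 1.0 / (1.0 + x * x)


def f_hat(xi):
    # Closed form assumed from the Poisson kernel pair (Lemma 2.4 with y = 1).
    return math.pi * math.exp(-2.0 * math.pi * abs(xi))


def norm_squared(func, cutoff, n):
    """Approximate the integral of |func|^2 over [-cutoff, cutoff]."""
    return integrate(lambda s: abs(func(s)) ** 2, -cutoff, cutoff, n)
```

Because $|f|^2$ decays only like $x^{-4}$, a wide cutoff (for example 200) is needed on the $x$ side, whereas the exponential decay of $\hat{f}$ allows a short interval on the $\xi$ side.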

1.8 The Weierstrass approximation theorem 

We now digress briefly by further exploiting our good kernels to prove 
the Weierstrass approximation theorem. This result was already alluded 
to in Chapter 2. 

Theorem 1.13 Let $f$ be a continuous function on the closed and bounded
interval $[a, b] \subset \mathbb{R}$. Then, for any $\epsilon > 0$, there exists a polynomial $P$ such
that

\[ \sup_{x \in [a, b]} |f(x) - P(x)| < \epsilon. \]


In other words, f can be uniformly approximated by polynomials. 

Proof. Let $[-M, M]$ denote any interval that contains $[a, b]$ in its
interior, and let $g$ be a continuous function on $\mathbb{R}$ that equals 0 outside
$[-M, M]$ and equals $f$ in $[a, b]$. For example, extend $f$ as follows: from $b$
to $M$ define $g$ by a straight line segment going from $f(b)$ to 0, and from
$a$ to $-M$ by a straight line segment from $f(a)$ also to 0. Let $B$ be a
bound for $g$, that is, $|g(x)| \le B$ for all $x$. Then, since $\{K_\delta\}$ is a family of
good kernels, and $g$ is continuous with compact support, we may argue
as in the proof of Corollary 1.7 to see that $g * K_\delta$ converges uniformly
to $g$ as $\delta$ tends to 0. In fact, we choose $\delta_0$ so that

\[ |g(x) - (g * K_{\delta_0})(x)| < \epsilon/2 \quad \text{for all } x \in \mathbb{R}. \]

Now, we recall that $e^x$ is given by the power series expansion $e^x = \sum_{n=0}^{\infty} x^n/n!$,
which converges uniformly in every compact interval of $\mathbb{R}$.
Therefore, there exists an integer $N$ so that

\[ |K_{\delta_0}(x) - R(x)| < \frac{\epsilon}{4MB} \quad \text{for all } x \in [-2M, 2M], \]

where $R(x) = \delta_0^{-1/2} \sum_{n=0}^{N} \frac{(-\pi x^2/\delta_0)^n}{n!}$. Then, recalling that $g$ vanishes
outside the interval $[-M, M]$, we have that for all $x \in [-M, M]$

\[
\begin{aligned}
|(g * K_{\delta_0})(x) - (g * R)(x)| &= \left| \int_{-M}^{M} g(t) \left[ K_{\delta_0}(x - t) - R(x - t) \right] dt \right| \\
&\le \int_{-M}^{M} |g(t)|\, |K_{\delta_0}(x - t) - R(x - t)|\, dt \\
&\le 2MB \sup_{z \in [-2M, 2M]} |K_{\delta_0}(z) - R(z)| \\
&< \epsilon/2.
\end{aligned}
\]

Therefore, the triangle inequality implies that $|g(x) - (g * R)(x)| < \epsilon$
whenever $x \in [-M, M]$, hence $|f(x) - (g * R)(x)| < \epsilon$ when $x \in [a, b]$.

Finally, note that $g * R$ is a polynomial in the $x$ variable. Indeed, by
definition we have $(g * R)(x) = \int_{-M}^{M} g(t) R(x - t)\, dt$, and $R(x - t)$ is a
polynomial in $x$ since it can be expressed, after several expansions, as
$R(x - t) = \sum_n a_n(t) x^n$ where the sum is finite. This concludes the proof
of the theorem.


2 Applications to some partial differential equations 

We mentioned earlier that a crucial property of the Fourier transform 
is that it interchanges differentiation and multiplication by polynomials. 
We now use this crucial fact together with the Fourier inversion theorem 
to solve some specific partial differential equations. 

2.1 The time-dependent heat equation on the real line 

In Chapter 4 we considered the heat equation on the circle. Here we
study the analogous problem on the real line.
Consider an infinite rod, which we model by the real line, and suppose
that we are given an initial temperature distribution $f(x)$ on the rod
at time $t = 0$. We wish now to determine the temperature $u(x, t)$ at
a point $x$ at time $t > 0$. Considerations similar to the ones given in
Chapter 1 show that when $u$ is appropriately normalized, it solves the
following partial differential equation:

\[ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \tag{8} \]

called the heat equation. The initial condition we impose is

\[ u(x, 0) = f(x). \]

Just as in the case of the circle, the solution is given in terms of a 
convolution. Indeed, define the heat kernel of the line by 


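\[ \mathcal{H}_t(x) = K_\delta(x), \quad \text{with } \delta = 4\pi t, \]

so that

\[ \mathcal{H}_t(x) = \frac{1}{(4\pi t)^{1/2}}\, e^{-x^2/4t} \quad \text{and} \quad \widehat{\mathcal{H}_t}(\xi) = e^{-4\pi^2 t \xi^2}. \]

Taking the Fourier transform of equation (8) in the $x$ variable
(formally) leads to

\[ \frac{\partial \hat{u}}{\partial t}(\xi, t) = -4\pi^2 \xi^2\, \hat{u}(\xi, t). \]

Fixing $\xi$, this is an ordinary differential equation in the variable $t$ (with
unknown $\hat{u}(\xi, \cdot)$), so there exists a constant $A(\xi)$ so that

\[ \hat{u}(\xi, t) = A(\xi)\, e^{-4\pi^2 \xi^2 t}. \]

We may also take the Fourier transform of the initial condition and obtain
$\hat{u}(\xi, 0) = \hat{f}(\xi)$, hence $A(\xi) = \hat{f}(\xi)$. This leads to the following theorem.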

Theorem 2.1 Given $f \in \mathcal{S}(\mathbb{R})$, let

\[ u(x, t) = (f * \mathcal{H}_t)(x) \quad \text{for } t > 0, \]

where $\mathcal{H}_t$ is the heat kernel. Then:

(i) The function $u$ is $C^2$ when $x \in \mathbb{R}$ and $t > 0$, and $u$ solves the heat
equation.

(ii) $u(x, t) \to f(x)$ uniformly in $x$ as $t \to 0$. Hence if we set $u(x, 0) = f(x)$,
then $u$ is continuous on the closure of the upper half-plane
$\overline{\mathbb{R}^2_+} = \{(x, t) : x \in \mathbb{R},\ t \ge 0\}$.

(iii) $\int_{-\infty}^{\infty} |u(x, t) - f(x)|^2\, dx \to 0$ as $t \to 0$.

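Proof. Because $u = f * \mathcal{H}_t$, taking the Fourier transform in the
$x$-variable gives $\hat{u} = \hat{f}\, \widehat{\mathcal{H}_t}$, and so $\hat{u}(\xi, t) = \hat{f}(\xi)\, e^{-4\pi^2 \xi^2 t}$. The Fourier
inversion formula gives

\[ u(x, t) = \int_{-\infty}^{\infty} \hat{f}(\xi)\, e^{-4\pi^2 t \xi^2}\, e^{2\pi i \xi x}\, d\xi. \]

By differentiating under the integral sign, one verifies (i). In fact, one
observes that $u$ is indefinitely differentiable. Note that (ii) is an
immediate consequence of Corollary 1.7. Finally, by Plancherel's formula, we
have

\[
\begin{aligned}
\int_{-\infty}^{\infty} |u(x, t) - f(x)|^2\, dx &= \int_{-\infty}^{\infty} |\hat{u}(\xi, t) - \hat{f}(\xi)|^2\, d\xi \\
&= \int_{-\infty}^{\infty} |\hat{f}(\xi)|^2\, |e^{-4\pi^2 t \xi^2} - 1|^2\, d\xi \\
&\le \int_{-\infty}^{\infty} |\hat{f}(\xi)|^2\, |e^{-4\pi^2 t \xi^2} - 1|\, d\xi,
\end{aligned}
\]

where the last step uses $|e^{-4\pi^2 t \xi^2} - 1| \le 1$. To see that this last integral
goes to 0 as $t \to 0$, we argue as follows:
since $|e^{-4\pi^2 t \xi^2} - 1| \le 2$ and $f \in \mathcal{S}(\mathbb{R})$, we can find $N$ so that

\[ \int_{|\xi| \ge N} |\hat{f}(\xi)|^2\, |e^{-4\pi^2 t \xi^2} - 1|\, d\xi < \epsilon, \]

and for all small $t$ we have $\sup_{|\xi| \le N} |\hat{f}(\xi)|^2 |e^{-4\pi^2 t \xi^2} - 1| < \epsilon/2N$ since $\hat{f}$
is bounded. Thus

\[ \int_{|\xi| \le N} |\hat{f}(\xi)|^2\, |e^{-4\pi^2 t \xi^2} - 1|\, d\xi < \epsilon \quad \text{for all small } t. \]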

This completes the proof of the theorem. 
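
The formula $u = f * \mathcal{H}_t$ can be checked against a case where the solution is known in closed form. The sketch below is an illustration, not the text's method of proof: for the assumed initial datum $f(x) = e^{-\pi x^2}$ one has $\hat{u}(\xi, t) = e^{-\pi(1 + 4\pi t)\xi^2}$, hence $u(x, t) = (1 + 4\pi t)^{-1/2} e^{-\pi x^2/(1 + 4\pi t)}$. The code evaluates the convolution by the midpoint rule and compares; cutoff and step count are arbitrary choices.

```python
import math


def heat_kernel(x, t):
    """Heat kernel of the line: (4 pi t)^(-1/2) exp(-x^2 / 4t)."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def solve_heat(f, x, t, cutoff=10.0, n=4000):
    """Approximate u(x, t) = integral of f(x - y) H_t(y) dy by the midpoint rule."""
    h = 2.0 * cutoff / n
    total = 0.0
    for k in range(n):
        y = -cutoff + (k + 0.5) * h
        total += f(x - y) * heat_kernel(y, t)
    return h * total


def f(x):
    # Illustrative initial temperature distribution (an assumption).
    return math.exp(-math.pi * x * x)


def exact(x, t):
    """Closed-form solution for the Gaussian initial datum above."""
    b = 1.0 + 4.0 * math.pi * t
    return math.exp(-math.pi * x * x / b) / math.sqrt(b)
```

Evaluating `solve_heat(f, x, t)` for a very small `t` also illustrates part (ii): the values approach `f(x)` as `t` decreases.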

The above theorem guarantees the existence of a solution to the heat
equation with initial data $f$. This solution is also unique, if uniqueness
is formulated appropriately. In this regard, we note that $u = f * \mathcal{H}_t$,
$f \in \mathcal{S}(\mathbb{R})$, satisfies the following additional property.

Corollary 2.2 $u(\cdot, t)$ belongs to $\mathcal{S}(\mathbb{R})$ uniformly in $t$, in the sense that
for any $T > 0$

\[ \sup_{\substack{x \in \mathbb{R} \\ 0 < t < T}} |x|^k \left| \frac{\partial^\ell}{\partial x^\ell} u(x, t) \right| < \infty \quad \text{for each } k, \ell \ge 0. \tag{9} \]

Proof. This result is a consequence of the following estimate:

\[
\begin{aligned}
|u(x, t)| &\le \int_{|y| \le |x|/2} |f(x - y)|\, \mathcal{H}_t(y)\, dy + \int_{|y| \ge |x|/2} |f(x - y)|\, \mathcal{H}_t(y)\, dy \\
&\le \frac{C_N}{(1 + |x|)^N} + \frac{C}{\sqrt{t}}\, e^{-c x^2/t}.
\end{aligned}
\]

Indeed, since $f$ is rapidly decreasing, we have $|f(x - y)| \le C_N/(1 + |x|)^N$
when $|y| \le |x|/2$. Also, if $|y| \ge |x|/2$ then $\mathcal{H}_t(y) \le C t^{-1/2} e^{-c x^2/t}$, and we
obtain the above inequality. Consequently, we see that $u(x, t)$ is rapidly
decreasing uniformly for $0 < t < T$.

The same argument can be applied to the derivatives of $u$ in the $x$
variable since we may differentiate under the integral sign and apply the
above estimate with $f$ replaced by $f'$, and so on.

This leads to the following uniqueness theorem. 

Theorem 2.3 Suppose $u(x, t)$ satisfies the following conditions:

(i) $u$ is continuous on the closure of the upper half-plane.

(ii) $u$ satisfies the heat equation for $t > 0$.

(iii) $u$ satisfies the boundary condition $u(x, 0) = 0$.

(iv) $u(\cdot, t) \in \mathcal{S}(\mathbb{R})$ uniformly in $t$, as in (9).

Then, we conclude that $u = 0$.

Below we use the abbreviations $\partial_x^\ell u$ and $\partial_t u$ to denote $\partial^\ell u/\partial x^\ell$ and
$\partial u/\partial t$, respectively.

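Proof. We define the energy at time $t$ of the solution $u(x, t)$ by

\[ E(t) = \int_{\mathbb{R}} |u(x, t)|^2\, dx. \]

Clearly $E(t) \ge 0$. Since $E(0) = 0$ it suffices to show that $E$ is a
decreasing function, and this is achieved by proving that $dE/dt \le 0$. The
assumptions on $u$ allow us to differentiate $E(t)$ under the integral sign:

\[ \frac{dE}{dt} = \int_{\mathbb{R}} \left[ \partial_t u(x, t)\, \overline{u}(x, t) + u(x, t)\, \partial_t \overline{u}(x, t) \right] dx. \]

But $u$ satisfies the heat equation, therefore $\partial_t u = \partial_x^2 u$ and $\partial_t \overline{u} = \partial_x^2 \overline{u}$, so
that after an integration by parts, where we use the fact that $u$ and its
$x$ derivatives decrease rapidly as $|x| \to \infty$, we find

\[
\begin{aligned}
\frac{dE}{dt} &= \int_{\mathbb{R}} \left[ \partial_x^2 u(x, t)\, \overline{u}(x, t) + u(x, t)\, \partial_x^2 \overline{u}(x, t) \right] dx \\
&= -\int_{\mathbb{R}} \left[ \partial_x u(x, t)\, \partial_x \overline{u}(x, t) + \partial_x u(x, t)\, \partial_x \overline{u}(x, t) \right] dx \\
&= -2 \int_{\mathbb{R}} |\partial_x u(x, t)|^2\, dx \\
&\le 0,
\end{aligned}
\]

as claimed. Thus $E(t) = 0$ for all $t$, hence $u = 0$.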

Another uniqueness theorem for the heat equation, with a less restrictive
assumption than (9), can be found in Problem 6. Examples when
uniqueness fails are given in Exercise 12 and Problem 4.


2.2 The steady-state heat equation in the upper half-plane 

The equation we are now concerned with is

\[ \Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{10} \]

in the upper half-plane $\mathbb{R}^2_+ = \{(x, y) : x \in \mathbb{R},\ y > 0\}$. The boundary
condition we require is $u(x, 0) = f(x)$. The operator $\Delta$ is the Laplacian and
the above partial differential equation describes the steady-state heat
distribution in $\mathbb{R}^2_+$ subject to $u = f$ on the boundary. The kernel that solves
this problem is called the Poisson kernel for the upper half-plane, and
is given by

\[ \mathcal{P}_y(x) = \frac{1}{\pi} \frac{y}{x^2 + y^2} \quad \text{where } x \in \mathbb{R} \text{ and } y > 0. \]

This is the analogue of the Poisson kernel for the disc discussed in
Section 5.4 of Chapter 2.

Note that for each fixed $y$ the kernel $\mathcal{P}_y$ is only of moderate decrease
as a function of $x$, so we will use the theory of the Fourier transform
appropriate for these types of functions (see Section 1.7).

We proceed as in the case of the time-dependent heat equation, by
taking the Fourier transform of equation (10) (formally) in the $x$ variable,
thereby obtaining

\[ -4\pi^2 \xi^2\, \hat{u}(\xi, y) + \frac{\partial^2 \hat{u}}{\partial y^2}(\xi, y) = 0 \]

with the boundary condition $\hat{u}(\xi, 0) = \hat{f}(\xi)$. The general solution of this
ordinary differential equation in $y$ (with $\xi$ fixed) takes the form

\[ \hat{u}(\xi, y) = A(\xi)\, e^{-2\pi |\xi| y} + B(\xi)\, e^{2\pi |\xi| y}. \]

If we disregard the second term because of its rapid exponential increase
we find, after setting $y = 0$, that

\[ \hat{u}(\xi, y) = \hat{f}(\xi)\, e^{-2\pi |\xi| y}. \]

Therefore $u$ is given in terms of the convolution of $f$ with a kernel whose
Fourier transform is $e^{-2\pi |\xi| y}$. This is precisely the Poisson kernel given
above, as we prove next.

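Lemma 2.4 The following two identities hold:

\[ \int_{-\infty}^{\infty} e^{-2\pi |\xi| y}\, e^{2\pi i \xi x}\, d\xi = \mathcal{P}_y(x), \]

\[ \int_{-\infty}^{\infty} \mathcal{P}_y(x)\, e^{-2\pi i x \xi}\, dx = e^{-2\pi |\xi| y}. \]

Proof. The first formula is fairly straightforward since we can split
the integral from $-\infty$ to 0 and 0 to $\infty$. Then, since $y > 0$ we have

\[ \int_0^{\infty} e^{-2\pi \xi y}\, e^{2\pi i \xi x}\, d\xi = \int_0^{\infty} e^{2\pi i (x + iy)\xi}\, d\xi = \left[ \frac{e^{2\pi i (x + iy)\xi}}{2\pi i (x + iy)} \right]_0^{\infty} = -\frac{1}{2\pi i (x + iy)}, \]

and similarly,

\[ \int_{-\infty}^{0} e^{2\pi \xi y}\, e^{2\pi i \xi x}\, d\xi = \frac{1}{2\pi i (x - iy)}. \]

Therefore

\[ \int_{-\infty}^{\infty} e^{-2\pi |\xi| y}\, e^{2\pi i \xi x}\, d\xi = \frac{1}{2\pi i (x - iy)} - \frac{1}{2\pi i (x + iy)} = \frac{y}{\pi (x^2 + y^2)}. \]

The second formula is now a consequence of the Fourier inversion theorem
applied in the case when $f$ and $\hat{f}$ are of moderate decrease.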


Lemma 2.5 The Poisson kernel is a good kernel on $\mathbb{R}$ as $y \to 0$.

Proof. Setting $\xi = 0$ in the second formula of Lemma 2.4 shows that
$\int_{-\infty}^{\infty} \mathcal{P}_y(x)\, dx = 1$, and clearly $\mathcal{P}_y(x) \ge 0$, so it remains to check the last
property of good kernels. Given a fixed $\delta > 0$, we may change variables
$u = x/y$ so that

\[ \int_{\delta}^{\infty} \frac{y}{x^2 + y^2}\, dx = \int_{\delta/y}^{\infty} \frac{du}{1 + u^2} = \left[ \arctan u \right]_{\delta/y}^{\infty} = \pi/2 - \arctan(\delta/y), \]

and this quantity goes to 0 as $y \to 0$. Since $\mathcal{P}_y(x)$ is an even function,
the proof is complete.
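
Both lemmas lend themselves to a quick numerical check. The following sketch (an illustration; the function names, cutoff rule, and step count are assumptions) evaluates the first integral of Lemma 2.4 by the midpoint rule and compares it with the closed form of the Poisson kernel. It also encodes the mass of $\mathcal{P}_y$ outside $|x| < \delta$, which by the arctangent computation and evenness equals $1 - (2/\pi)\arctan(\delta/y)$.

```python
import math


def poisson_kernel(x, y):
    """Poisson kernel of the upper half-plane, y / (pi (x^2 + y^2))."""
    return y / (math.pi * (x * x + y * y))


def poisson_from_fourier(x, y, n=20000):
    """Midpoint approximation of the integral of e^{-2 pi |xi| y} e^{2 pi i xi x} d xi.

    The imaginary part cancels by symmetry, so only the cosine is summed.
    The cutoff 6 / y makes the neglected tail of order e^{-12 pi}.
    """
    cutoff = 6.0 / y
    h = 2.0 * cutoff / n
    total = 0.0
    for k in range(n):
        xi = -cutoff + (k + 0.5) * h
        total += math.exp(-2.0 * math.pi * abs(xi) * y) * math.cos(2.0 * math.pi * xi * x)
    return h * total


def tail_mass(delta, y):
    """Integral of the Poisson kernel over |x| >= delta."""
    return 1.0 - (2.0 / math.pi) * math.atan(delta / y)
```

For fixed `delta`, the values `tail_mass(delta, y)` shrink to 0 as `y` decreases, which is the third property of good kernels.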

The following theorem establishes the existence of a solution to our 
problem. 

Theorem 2.6 Given $f \in \mathcal{S}(\mathbb{R})$, let $u(x, y) = (f * \mathcal{P}_y)(x)$. Then:

(i) $u(x, y)$ is $C^2$ in $\mathbb{R}^2_+$ and $\Delta u = 0$.

(ii) $u(x, y) \to f(x)$ uniformly as $y \to 0$.

(iii) $\int_{-\infty}^{\infty} |u(x, y) - f(x)|^2\, dx \to 0$ as $y \to 0$.

(iv) If $u(x, 0) = f(x)$, then $u$ is continuous on the closure $\overline{\mathbb{R}^2_+}$ of the
upper half-plane, and vanishes at infinity in the sense that

\[ u(x, y) \to 0 \quad \text{as } |x| + y \to \infty. \]

Proof. The proofs of parts (i), (ii), and (iii) are similar to the case of
the heat equation, and so are left to the reader. Part (iv) is a consequence
of two easy estimates whenever $f$ is of moderate decrease. First, we have

\[ |(f * \mathcal{P}_y)(x)| \le C \left( \frac{1}{1 + x^2} + \frac{y}{x^2 + y^2} \right), \]

which is proved (as in the case of the heat equation) by splitting the
integral $\int_{-\infty}^{\infty} f(x - t)\, \mathcal{P}_y(t)\, dt$ into the part where $|t| \le |x|/2$ and the part
where $|t| \ge |x|/2$. Also, we have $|(f * \mathcal{P}_y)(x)| \le C/y$, since $\sup_x \mathcal{P}_y(x) \le c/y$.

Using the first estimate when $|x| \ge |y|$ and the second when $|x| \le |y|$
gives the desired decrease at infinity.

We next show that the solution is essentially unique. 

Theorem 2.7 Suppose $u$ is continuous on the closure of the upper half-plane
$\overline{\mathbb{R}^2_+}$, satisfies $\Delta u = 0$ for $(x, y) \in \mathbb{R}^2_+$, $u(x, 0) = 0$, and $u(x, y)$
vanishes at infinity. Then $u = 0$.

A simple example shows that a condition concerning the decay of $u$ at
infinity is needed: take $u(x, y) = y$. Clearly $u$ satisfies the steady-state
heat equation and vanishes on the real line, yet $u$ is not identically zero.

The proof of the theorem relies on a basic fact about harmonic functions,
which are functions satisfying $\Delta u = 0$. The fact is that the value
of a harmonic function at a point equals its average value around any
circle centered at that point.

Lemma 2.8 (Mean-value property) Suppose $\Omega$ is an open set in $\mathbb{R}^2$
and let $u$ be a function of class $C^2$ with $\Delta u = 0$ in $\Omega$. If the closure of
the disc centered at $(x, y)$ and of radius $R$ is contained in $\Omega$, then

\[ u(x, y) = \frac{1}{2\pi} \int_0^{2\pi} u(x + r\cos\theta,\ y + r\sin\theta)\, d\theta \]

for all $0 \le r \le R$.

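Proof. Let $U(r, \theta) = u(x + r\cos\theta,\ y + r\sin\theta)$. Expressing the
Laplacian in polar coordinates, the equation $\Delta u = 0$ then implies

\[ 0 = \frac{\partial^2 U}{\partial \theta^2} + r \frac{\partial}{\partial r}\left( r \frac{\partial U}{\partial r} \right). \]

If we define $F(r) = \frac{1}{2\pi} \int_0^{2\pi} U(r, \theta)\, d\theta$, the above gives

\[ r \frac{\partial}{\partial r}\left( r \frac{\partial F}{\partial r} \right) = \frac{1}{2\pi} \int_0^{2\pi} -\frac{\partial^2 U}{\partial \theta^2}(r, \theta)\, d\theta. \]

The integral of $\partial^2 U/\partial \theta^2$ over the circle vanishes since $\partial U/\partial \theta$ is
periodic, hence $r \frac{\partial}{\partial r}\left( r \frac{\partial F}{\partial r} \right) = 0$, and consequently $r\, \partial F/\partial r$ must be constant.
Evaluating this expression at $r = 0$ we find that $\partial F/\partial r = 0$. Thus $F$ is
constant, but since $F(0) = u(x, y)$, we finally find that $F(r) = u(x, y)$ for
all $0 \le r \le R$, which is the mean-value property.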

Finally, note that the argument above is implicit in the proof of
Theorem 5.7, Chapter 2.
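
The mean-value property is easy to observe numerically. The sketch below is an illustration with assumed test functions: it averages a function over equally spaced points of a circle, which is an accurate quadrature for smooth periodic integrands. For the harmonic function $e^x \cos y$ the average reproduces the value at the center, while for the non-harmonic function $x^2 + y^2$ it exceeds the central value by exactly $r^2$.

```python
import math


def circle_average(u, x, y, r, n=2000):
    """Average of u over the circle of radius r centered at (x, y)."""
    total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        total += u(x + r * math.cos(theta), y + r * math.sin(theta))
    return total / n


def harmonic(x, y):
    # Real part of e^{x + iy}; satisfies the Laplace equation everywhere.
    return math.exp(x) * math.cos(y)


def not_harmonic(x, y):
    # Laplacian equals 4, so the mean-value property must fail.
    return x * x + y * y
```

Comparing `circle_average(harmonic, 0.3, 1.2, 0.8)` with `harmonic(0.3, 1.2)` shows agreement to rounding error, independent of the radius.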

To prove Theorem 2.7 we argue by contradiction. Considering
separately the real and imaginary parts of $u$, we may suppose that $u$ itself
is real-valued, and is somewhere strictly positive, say $u(x_0, y_0) > 0$ for
some $x_0 \in \mathbb{R}$ and $y_0 > 0$. We shall see that this leads to a contradiction.
First, since $u$ vanishes at infinity, we can find a large semi-disc of
radius $R$, $D_R^+ = \{(x, y) : x^2 + y^2 \le R^2,\ y \ge 0\}$, outside of which $u(x, y) \le \frac{1}{2} u(x_0, y_0)$.
Next, since $u$ is continuous in $D_R^+$, it attains its maximum
$M$ there, so there exists a point $(x_1, y_1) \in D_R^+$ with $u(x_1, y_1) = M$, while
$u(x, y) \le M$ in the semi-disc; also, since $u(x, y) \le \frac{1}{2} u(x_0, y_0) \le M/2$
outside of the semi-disc, we have $u(x, y) \le M$ throughout the entire upper
half-plane. Now the mean-value property for harmonic functions implies

\[ u(x_1, y_1) = \frac{1}{2\pi} \int_0^{2\pi} u(x_1 + \rho\cos\theta,\ y_1 + \rho\sin\theta)\, d\theta \]

whenever the circle of integration lies in the upper half-plane. In
particular, this equation holds if $0 < \rho < y_1$. Since $u(x_1, y_1)$ equals the
maximum value $M$, and $u(x_1 + \rho\cos\theta,\ y_1 + \rho\sin\theta) \le M$, it follows by
continuity that $u(x_1 + \rho\cos\theta,\ y_1 + \rho\sin\theta) = M$ on the whole circle. For
otherwise $u(x, y) \le M - \epsilon$ on an arc of length $\delta > 0$ on the circle, and
this would give

\[ \frac{1}{2\pi} \int_0^{2\pi} u(x_1 + \rho\cos\theta,\ y_1 + \rho\sin\theta)\, d\theta \le M - \frac{\epsilon \delta}{2\pi} < M, \]

contradicting the fact that $u(x_1, y_1) = M$. Now letting $\rho \to y_1$, and using
the continuity of $u$ again, we see that this implies $u(x_1, 0) = M > 0$,
which contradicts the fact that $u(x, 0) = 0$ for all $x$.


3 The Poisson summation formula 

The definition of the Fourier transform was motivated by the desire for 
a continuous version of Fourier series, applicable to functions defined 
on the real line. We now show that there exists a further remarkable 
connection between the analysis of functions on the circle and related 
functions on $\mathbb{R}$.

Given a function $f \in \mathcal{S}(\mathbb{R})$ on the real line, we can construct a new
function on the circle by the recipe

\[ F_1(x) = \sum_{n=-\infty}^{\infty} f(x + n). \]

Since $f$ is rapidly decreasing, the series converges absolutely and
uniformly on every compact subset of $\mathbb{R}$, so $F_1$ is continuous. Note that
$F_1(x + 1) = F_1(x)$ because passage from $n$ to $n + 1$ in the above sum
merely shifts the terms in the series defining $F_1(x)$. Hence $F_1$ is periodic
with period 1. The function $F_1$ is called the periodization of $f$.

There is another way to arrive at a “periodic version” of $f$, this time
by Fourier analysis. Start with the identity

\[ f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi)\, e^{2\pi i \xi x}\, d\xi \]

and consider its discrete analogue, where the integral is replaced by a
sum

\[ F_2(x) = \sum_{n=-\infty}^{\infty} \hat{f}(n)\, e^{2\pi i n x}. \]

Once again, the sum converges absolutely and uniformly since $\hat{f}$ belongs
to the Schwartz space, hence $F_2$ is continuous. Moreover, $F_2$ is also
periodic of period 1 since this is the case for each one of the exponentials
$e^{2\pi i n x}$.

The fundamental fact is that these two approaches, which produce $F_1$
and $F_2$, actually lead to the same function.

Theorem 3.1 (Poisson summation formula) If $f \in \mathcal{S}(\mathbb{R})$, then

\[ \sum_{n=-\infty}^{\infty} f(x + n) = \sum_{n=-\infty}^{\infty} \hat{f}(n)\, e^{2\pi i n x}. \]

In particular, setting $x = 0$ we have

\[ \sum_{n=-\infty}^{\infty} f(n) = \sum_{n=-\infty}^{\infty} \hat{f}(n). \]

In other words, the Fourier coefficients of the periodization of $f$ are
given precisely by the values of the Fourier transform of $f$ on the integers.

Proof. To check the first formula it suffices, by Theorem 2.1 in
Chapter 2, to show that both sides (which are continuous) have the
same Fourier coefficients (viewed as functions on the circle). Clearly, the
$m$-th Fourier coefficient of the right-hand side is $\hat{f}(m)$. For the left-hand
side we have

\[
\begin{aligned}
\int_0^1 \left( \sum_{n=-\infty}^{\infty} f(x + n) \right) e^{-2\pi i m x}\, dx &= \sum_{n=-\infty}^{\infty} \int_0^1 f(x + n)\, e^{-2\pi i m x}\, dx \\
&= \sum_{n=-\infty}^{\infty} \int_n^{n+1} f(y)\, e^{-2\pi i m y}\, dy \\
&= \int_{-\infty}^{\infty} f(y)\, e^{-2\pi i m y}\, dy \\
&= \hat{f}(m),
\end{aligned}
\]

where the interchange of the sum and integral is permissible since $f$ is
rapidly decreasing. This completes the proof of the theorem.

We observe that the theorem extends to the case when we merely
assume that both $f$ and $\hat{f}$ are of moderate decrease; the proof is in fact
unchanged.
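
A direct numerical comparison of the two sides of the Poisson summation formula is sketched below. It is an illustration under the assumption that $f(x) = e^{-\pi s x^2}$ with $s > 0$, whose transform is $\hat{f}(\xi) = s^{-1/2} e^{-\pi \xi^2/s}$; the truncation at 50 terms is an arbitrary choice that is more than enough for Gaussian decay.

```python
import cmath
import math


def f(x, s):
    """Gaussian f(x) = exp(-pi s x^2)."""
    return math.exp(-math.pi * s * x * x)


def f_hat(xi, s):
    """Its Fourier transform, s^(-1/2) exp(-pi xi^2 / s)."""
    return math.exp(-math.pi * xi * xi / s) / math.sqrt(s)


def periodization(x, s, terms=50):
    """Truncated sum of f(x + n) over |n| <= terms."""
    return sum(f(x + n, s) for n in range(-terms, terms + 1))


def fourier_side(x, s, terms=50):
    """Truncated sum of f_hat(n) e^{2 pi i n x} over |n| <= terms."""
    return sum(
        f_hat(n, s) * cmath.exp(2j * math.pi * n * x)
        for n in range(-terms, terms + 1)
    )
```

The Fourier side is returned as a complex number whose imaginary part is zero up to rounding, since the terms for $n$ and $-n$ are complex conjugates.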

It turns out that the operation of periodization is important in a 
number of questions, even when the Poisson summation formula does 
not apply. We give an example by considering the elementary function 
$f(x) = 1/x$, $x \ne 0$. The result is that $\sum_{n=-\infty}^{\infty} 1/(x + n)$, when summed
symmetrically, gives the partial fraction decomposition of the cotangent
function. In fact this sum equals $\pi \cot \pi x$, when $x$ is not an integer.
Similarly with $f(x) = 1/x^2$, we get $\sum_{n=-\infty}^{\infty} 1/(x + n)^2 = \pi^2/(\sin \pi x)^2$,
whenever $x \notin \mathbb{Z}$ (see Exercise 15).

3.1 Theta and zeta functions 

We define the theta function $\vartheta(s)$ for $s > 0$ by

\[ \vartheta(s) = \sum_{n=-\infty}^{\infty} e^{-\pi n^2 s}. \]

The condition on $s$ ensures the absolute convergence of the series. A
crucial fact about this special function is that it satisfies the following 
functional equation. 

Theorem 3.2 $s^{-1/2}\, \vartheta(1/s) = \vartheta(s)$ whenever $s > 0$.

The proof of this identity consists of a simple application of the Poisson
summation formula to the pair

\[ f(x) = e^{-\pi s x^2} \quad \text{and} \quad \hat{f}(\xi) = s^{-1/2}\, e^{-\pi \xi^2/s}. \]
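
The functional equation of Theorem 3.2 can be observed numerically. The sketch below is an illustration; the truncation at 200 terms is an assumption that suffices for $s \ge 0.01$. It sums the series for $\vartheta$ using the symmetry in $n$.

```python
import math


def theta(s, terms=200):
    """Truncated theta series: 1 + 2 * sum_{n=1}^{terms} exp(-pi n^2 s), for s > 0."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * s) for n in range(1, terms + 1))


def functional_equation_gap(s):
    """Difference between s^(-1/2) theta(1/s) and theta(s); should be near zero."""
    return theta(1.0 / s) / math.sqrt(s) - theta(s)
```

The identity is useful in practice: for small $s$ the series for $\vartheta(s)$ converges slowly, while the series for $\vartheta(1/s)$ is dominated by its first term.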

The theta function $\vartheta(s)$ also extends to complex values of $s$ when
$\mathrm{Re}(s) > 0$, and the functional equation is still valid then. The theta
function is intimately connected with an important function in number
theory, the zeta function $\zeta(s)$ defined for $\mathrm{Re}(s) > 1$ by

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}. \]

Later we will see that this function carries essential information about
the prime numbers (see Chapter 8).

It also turns out that $\zeta$, $\vartheta$, and another important function $\Gamma$ are
related by the following identity:

\[ \pi^{-s/2}\, \Gamma(s/2)\, \zeta(s) = \frac{1}{2} \int_0^{\infty} t^{s/2 - 1} \left( \vartheta(t) - 1 \right) dt, \]

which is valid for $s > 1$ (Exercises 17 and 18).

Returning to the function $\vartheta$, define the generalization $\Theta(z|\tau)$ given by

\[ \Theta(z|\tau) = \sum_{n=-\infty}^{\infty} e^{i\pi n^2 \tau}\, e^{2\pi i n z} \]

whenever $\mathrm{Im}(\tau) > 0$ and $z \in \mathbb{C}$. Taking $z = 0$ and $\tau = is$ we get
$\Theta(z|\tau) = \vartheta(s)$.

3.2 Heat kernels 

Another application related to the Poisson summation formula and the 
theta function is the time-dependent heat equation on the circle. A 
solution to the equation 

\[ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \]

subject to $u(x, 0) = f(x)$, where $f$ is periodic of period 1, was given in
the previous chapter by

\[ u(x, t) = (f * H_t)(x), \]

where $H_t(x)$ is the heat kernel on the circle, that is,

\[ H_t(x) = \sum_{n=-\infty}^{\infty} e^{-4\pi^2 n^2 t}\, e^{2\pi i n x}. \]

Note in particular that with our definition of the generalized theta
function in the previous section, we have $\Theta(x|4\pi i t) = H_t(x)$. Also, recall that
the heat equation on $\mathbb{R}$ gave rise to the heat kernel

\[ \mathcal{H}_t(x) = \frac{1}{(4\pi t)^{1/2}}\, e^{-x^2/4t}, \]

where $\widehat{\mathcal{H}_t}(\xi) = e^{-4\pi^2 \xi^2 t}$. The fundamental relation between these two
objects is an immediate consequence of the Poisson summation formula:

Theorem 3.3 The heat kernel on the circle is the periodization of the
heat kernel on the real line:

\[ H_t(x) = \sum_{n=-\infty}^{\infty} \mathcal{H}_t(x + n). \]

Although the proof that $\mathcal{H}_t$ is a good kernel on $\mathbb{R}$ was fairly
straightforward, we left open the harder problem that $H_t$ is a good kernel on the
circle. The above results allow us to resolve this matter.

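Corollary 3.4 The kernel $H_t(x)$ is a good kernel for $t \to 0$.

Proof. We already observed that $\int_{|x| \le 1/2} H_t(x)\, dx = 1$. Now note
that $H_t \ge 0$, which is immediate from the above formula since $\mathcal{H}_t \ge 0$.
Finally, we claim that when $|x| \le 1/2$,

\[ H_t(x) = \mathcal{H}_t(x) + \mathcal{E}_t(x), \]

where the error satisfies $|\mathcal{E}_t(x)| \le c_1 e^{-c_2/t}$ with $c_1, c_2 > 0$ and $0 < t \le 1$.
To see this, note again that the formula in the theorem gives

\[ H_t(x) = \mathcal{H}_t(x) + \sum_{|n| \ge 1} \mathcal{H}_t(x + n); \]

therefore, since $|x| \le 1/2$,

\[ \mathcal{E}_t(x) = \frac{1}{\sqrt{4\pi t}} \sum_{|n| \ge 1} e^{-(x + n)^2/4t} \le C t^{-1/2} \sum_{n \ge 1} e^{-c n^2/t}. \]

Note that $n^2/t \ge n^2$ and $n^2/t \ge 1/t$ whenever $0 < t \le 1$, so $e^{-c n^2/t} \le e^{-\frac{c}{2} n^2}\, e^{-\frac{c}{2} \frac{1}{t}}$. Hence

\[ |\mathcal{E}_t(x)| \le C t^{-1/2}\, e^{-\frac{c}{2} \frac{1}{t}} \sum_{n \ge 1} e^{-\frac{c}{2} n^2} \le c_1 e^{-c_2/t}. \]

The proof of the claim is complete, and as a result $\int_{|x| \le 1/2} |\mathcal{E}_t(x)|\, dx \to 0$
as $t \to 0$. It is now clear that $H_t$ satisfies

\[ \int_{\eta < |x| \le 1/2} |H_t(x)|\, dx \to 0 \quad \text{as } t \to 0, \]

because $\mathcal{H}_t$ does.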
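
Theorem 3.3 and the error estimate in the proof above can be illustrated numerically. In the sketch below (an illustration; the truncation levels are assumptions suited to $t \ge 10^{-3}$), the circle heat kernel is computed once from its Fourier series and once as the periodization of the line kernel, and the difference between the circle and line kernels is exposed as the error term $\mathcal{E}_t$.

```python
import math


def heat_kernel_line(x, t):
    """Heat kernel on the real line."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)


def heat_kernel_circle_fourier(x, t, terms=200):
    """Fourier series 1 + 2 * sum_{n>=1} e^{-4 pi^2 n^2 t} cos(2 pi n x)."""
    return 1.0 + 2.0 * sum(
        math.exp(-4.0 * math.pi ** 2 * n * n * t) * math.cos(2.0 * math.pi * n * x)
        for n in range(1, terms + 1)
    )


def heat_kernel_circle_periodized(x, t, terms=50):
    """Periodization: sum of the line kernel at x + n over |n| <= terms."""
    return sum(heat_kernel_line(x + n, t) for n in range(-terms, terms + 1))


def error_term(x, t):
    """E_t(x): circle kernel minus line kernel, for |x| <= 1/2."""
    return heat_kernel_circle_periodized(x, t) - heat_kernel_line(x, t)
```

For small `t` the periodized form needs only a handful of terms, whereas the Fourier series needs many; this is the same trade-off expressed by the theta functional equation.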


3.3 Poisson kernels 


In a similar manner to the discussion above about the heat kernels, we
state the relation between the Poisson kernels for the disc and the upper
half-plane where

\[ P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2} \quad \text{and} \quad \mathcal{P}_y(x) = \frac{1}{\pi} \frac{y}{y^2 + x^2}. \]

Theorem 3.5 $P_r(2\pi x) = \sum_{n \in \mathbb{Z}} \mathcal{P}_y(x + n)$ where $r = e^{-2\pi y}$.

This is again an immediate corollary of the Poisson summation formula
applied to $f(x) = \mathcal{P}_y(x)$ and $\hat{f}(\xi) = e^{-2\pi |\xi| y}$. Of course, here we use the
Poisson summation formula under the assumptions that $f$ and $\hat{f}$ are of
moderate decrease.
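
Theorem 3.5 can also be checked numerically, with one caveat: since $\mathcal{P}_y$ decays only like $1/x^2$, the periodization converges slowly, and a truncation at $|n| \le N$ leaves an error of roughly $2y/(\pi N)$. The sketch below is an illustration; the default truncation is an assumption chosen to reach about five correct digits.

```python
import math


def poisson_disc(r, theta):
    """Poisson kernel of the unit disc."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta) + r * r)


def poisson_half_plane(x, y):
    """Poisson kernel of the upper half-plane."""
    return y / (math.pi * (x * x + y * y))


def periodized_half_plane(x, y, terms=200000):
    """Symmetric partial sum of the half-plane kernel at x + n, |n| <= terms."""
    return sum(poisson_half_plane(x + n, y) for n in range(-terms, terms + 1))
```

With `r = math.exp(-2 * math.pi * y)`, the value `poisson_disc(r, 2 * math.pi * x)` should match `periodized_half_plane(x, y)` up to the truncation error described above.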

4 The Heisenberg uncertainty principle 

The mathematical thrust of the principle can be formulated in terms of a
relation between a function and its Fourier transform. The basic
underlying law, formulated in its vaguest and most general form, states that a
function and its Fourier transform cannot both be essentially localized.
Somewhat more precisely, if the “preponderance” of the mass of a
function is concentrated in an interval of length $L$, then the preponderance
of the mass of its Fourier transform cannot lie in an interval of length
essentially smaller than $L^{-1}$. The exact statement is as follows.

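Theorem 4.1 Suppose $\psi$ is a function in $\mathcal{S}(\mathbb{R})$ which satisfies the
normalizing condition $\int_{-\infty}^{\infty} |\psi(x)|^2\, dx = 1$. Then

\[ \left( \int_{-\infty}^{\infty} x^2 |\psi(x)|^2\, dx \right) \left( \int_{-\infty}^{\infty} \xi^2 |\hat{\psi}(\xi)|^2\, d\xi \right) \ge \frac{1}{16\pi^2}, \]

and equality holds if and only if $\psi(x) = A e^{-Bx^2}$ where $B > 0$ and $|A|^2 = \sqrt{2B/\pi}$.

In fact, we have

\[ \left( \int_{-\infty}^{\infty} (x - x_0)^2 |\psi(x)|^2\, dx \right) \left( \int_{-\infty}^{\infty} (\xi - \xi_0)^2 |\hat{\psi}(\xi)|^2\, d\xi \right) \ge \frac{1}{16\pi^2} \]

for every $x_0, \xi_0 \in \mathbb{R}$.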

Proof. The second inequality actually follows from the first by
replacing $\psi(x)$ by $e^{-2\pi i x \xi_0} \psi(x + x_0)$ and changing variables. To prove the
first inequality, we argue as follows. Beginning with our normalizing
assumption $\int |\psi|^2 = 1$, and recalling that $\psi$ and $\psi'$ are rapidly decreasing,
an integration by parts gives

\[
\begin{aligned}
1 &= \int_{-\infty}^{\infty} |\psi(x)|^2\, dx \\
&= -\int_{-\infty}^{\infty} x \frac{d}{dx} |\psi(x)|^2\, dx \\
&= -\int_{-\infty}^{\infty} \left( x \psi'(x) \overline{\psi(x)} + x \overline{\psi'(x)} \psi(x) \right) dx.
\end{aligned}
\]

The last identity follows because $|\psi|^2 = \psi \overline{\psi}$. Therefore

\[
\begin{aligned}
1 &\le 2 \int_{-\infty}^{\infty} |x|\, |\psi(x)|\, |\psi'(x)|\, dx \\
&\le 2 \left( \int_{-\infty}^{\infty} x^2 |\psi(x)|^2\, dx \right)^{1/2} \left( \int_{-\infty}^{\infty} |\psi'(x)|^2\, dx \right)^{1/2},
\end{aligned}
\]

where we have used the Cauchy-Schwarz inequality. The identity

\[ \int_{-\infty}^{\infty} |\psi'(x)|^2\, dx = 4\pi^2 \int_{-\infty}^{\infty} \xi^2 |\hat{\psi}(\xi)|^2\, d\xi, \]

which holds because of the properties of the Fourier transform and the
Plancherel formula, concludes the proof of the inequality in the theorem.

If equality holds, then we must also have equality where we applied the
Cauchy-Schwarz inequality, and as a result we find that $\psi'(x) = \beta x \psi(x)$
for some constant $\beta$. The solutions to this equation are $\psi(x) = A e^{\beta x^2/2}$,
where $A$ is constant. Since we want $\psi$ to be a Schwartz function, we must
take $\beta = -2B < 0$, and since we impose the condition $\int_{-\infty}^{\infty} |\psi(x)|^2\, dx = 1$
we find that $|A|^2 = \sqrt{2B/\pi}$, as was to be shown.
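
The equality case can be confirmed numerically. The sketch below is an illustration under stated assumptions: it uses the closed form $\hat{\psi}(\xi) = A\sqrt{\pi/B}\, e^{-\pi^2 \xi^2/B}$ for the Gaussian $\psi(x) = A e^{-Bx^2}$, and, for contrast, the normalized function $\psi(x) = c\, x e^{-\pi x^2}$ with $\hat{\psi}(\xi) = -i c\, \xi e^{-\pi \xi^2}$ and $c^2 = 4\sqrt{2}\pi$. The integrals are approximated by the midpoint rule on a truncated interval.

```python
import math


def integrate(func, lo, hi, n=4000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def uncertainty_product(psi, psi_hat, cutoff=12.0):
    """Product of the integrals of x^2 |psi|^2 and xi^2 |psi_hat|^2."""
    position = integrate(lambda x: x * x * abs(psi(x)) ** 2, -cutoff, cutoff)
    momentum = integrate(lambda xi: xi * xi * abs(psi_hat(xi)) ** 2, -cutoff, cutoff)
    return position * momentum


def gaussian_pair(B):
    """Normalized Gaussian A e^{-B x^2} and its (assumed closed-form) transform."""
    A = (2.0 * B / math.pi) ** 0.25
    psi = lambda x: A * math.exp(-B * x * x)
    psi_hat = lambda xi: A * math.sqrt(math.pi / B) * math.exp(-math.pi ** 2 * xi * xi / B)
    return psi, psi_hat


def odd_pair():
    """Normalized c x e^{-pi x^2} and its (assumed closed-form) transform."""
    c = math.sqrt(4.0 * math.sqrt(2.0) * math.pi)
    psi = lambda x: c * x * math.exp(-math.pi * x * x)
    psi_hat = lambda xi: -1j * c * xi * math.exp(-math.pi * xi * xi)
    return psi, psi_hat


LOWER_BOUND = 1.0 / (16.0 * math.pi ** 2)
```

For every $B > 0$ the Gaussian pair should return the lower bound $1/16\pi^2$, while the odd example gives the strictly larger value $9/16\pi^2$.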


The precise assertion contained in Theorem 4.1 first came to light in 
the study of quantum mechanics. It arose when one considered the extent 
to which one could simultaneously locate the position and momentum of 
a particle. Assuming we are dealing with (say) an electron that travels 
along the real line, then according to the laws of physics, matters are 
governed by a “state function” $\psi$, which we can assume to be in $\mathcal{S}(\mathbb{R})$,
and which is normalized according to the requirement that

\[ \int_{-\infty}^{\infty} |\psi(x)|^2\, dx = 1. \]

The position of the particle is then determined not as a definite point $x$;
instead its probable location is given by the rules of quantum mechanics
as follows:

• The probability that the particle is located in the interval $(a, b)$ is
$\int_a^b |\psi(x)|^2\, dx$.

According to this law we can calculate the probable location of the
particle with the aid of $\psi$; in fact, there may be only a small probability
that the particle is located in a given interval $(a', b')$, but nevertheless it
is somewhere on the real line since $\int_{-\infty}^{\infty} |\psi(x)|^2\, dx = 1$.

In addition to the probability density $|\psi(x)|^2\, dx$, there is the
expectation of where the particle might be. This expectation is the best
guess of the position of the particle, given its probability distribution
determined by $|\psi(x)|^2\, dx$, and is the quantity defined by

\[ \overline{x} = \int_{-\infty}^{\infty} x\, |\psi(x)|^2\, dx. \tag{12} \]

Why is this our best guess? Consider the simpler (idealized) situation
where we are given that the particle can be found at only finitely many
different points, $x_1, x_2, \ldots, x_N$ on the real axis, with $p_i$ the probability
that the particle is at $x_i$, and $p_1 + p_2 + \cdots + p_N = 1$. Then, if we knew
nothing else, and were forced to make one choice as to the position of the
particle, we would naturally take $\overline{x} = \sum_{i=1}^{N} x_i p_i$, which is the appropriate
weighted average of the possible positions. The quantity (12) is clearly
the general (integral) version of this.

We next come to the notion of variance, which in our terminology is
the uncertainty attached to our expectation. Having determined that
the expected position of the particle is $\overline{x}$ (given by (12)), the resulting
uncertainty is the quantity

\[ \int_{-\infty}^{\infty} (x - \overline{x})^2\, |\psi(x)|^2\, dx. \tag{13} \]

Notice that if $\psi$ is highly concentrated near $\overline{x}$, it means that there is a
high probability that $x$ is near $\overline{x}$, and so (13) is small, because most of
the contribution to the integral takes place for values of $x$ near $\overline{x}$. Here
we have a small uncertainty. On the other hand, if $\psi(x)$ is rather flat
(that is, the probability distribution $|\psi(x)|^2\, dx$ is not very concentrated),
then the integral (13) is rather big, because large values of $(x - \overline{x})^2$ will
come into play, and as a result the uncertainty is relatively large.

It is also worthwhile to observe that the expectation $\overline{x}$ is that choice
for which the uncertainty $\int_{-\infty}^{\infty} (x - \overline{x})^2 |\psi(x)|^2\, dx$ is the smallest. Indeed,
if we try to minimize this quantity by equating to 0 its derivative with
respect to $\overline{x}$, we find that $2 \int_{-\infty}^{\infty} (x - \overline{x}) |\psi(x)|^2\, dx = 0$, which gives (12).
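
The minimizing property of the expectation is easy to see in the finite (idealized) situation described earlier. The following sketch is an illustration with made-up points and probabilities: it computes the weighted average and shows that moving the center away from it by $d$ increases the spread by exactly $d^2$.

```python
def expectation(points, probs):
    """Weighted average sum of x_i p_i."""
    return sum(x * p for x, p in zip(points, probs))


def spread(points, probs, center):
    """Discrete analogue of (13): sum of (x_i - center)^2 p_i."""
    return sum((x - center) ** 2 * p for x, p in zip(points, probs))


# Hypothetical data: three possible positions and their probabilities.
POINTS = [-1.0, 0.5, 2.0]
PROBS = [0.2, 0.5, 0.3]
```

Here `expectation(POINTS, PROBS)` is the unique minimizer of `spread(POINTS, PROBS, c)` as a function of `c`, because the spread is a quadratic in `c` with leading coefficient 1.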

So far, we have discussed the “expectation” and “uncertainty” related 
to the position of the particle. Of equal relevance are the corresponding 
notions regarding its momentum. The corresponding rule of quantum 
mechanics is: 


• The probability that the momentum $\xi$ of the particle belongs to
the interval $(a, b)$ is $\int_a^b |\hat{\psi}(\xi)|^2\, d\xi$, where $\hat{\psi}$ is the Fourier transform
of $\psi$.










Combining these two laws with Theorem 4.1 gives $1/16\pi^2$ as the lower
bound for the product of the uncertainty of the position and the
uncertainty of the momentum of a particle. So the more certain we are about
the location of the particle, the less certain we can be about its
momentum, and vice versa. However, we have simplified the statement of
the two laws by rescaling to change the units of measurement. Actually,
there enters a fundamental but small physical number $\hbar$ called Planck's
constant. When properly taken into account, the physical conclusion is

\[ (\text{uncertainty of position}) \times (\text{uncertainty of momentum}) \ge \hbar/16\pi^2. \]

5 Exercises 

1. Corollary 2.3 in Chapter 2 leads to the following simplified version of the
Fourier inversion formula. Suppose $f$ is a continuous function supported on an
interval $[-M, M]$, whose Fourier transform $\hat{f}$ is of moderate decrease.

(a) Fix $L$ with $L/2 > M$, and show that $f(x) = \sum a_n(L) e^{2\pi i n x/L}$ where

$$a_n(L) = \frac{1}{L} \int_{-L/2}^{L/2} f(x) e^{-2\pi i n x/L}\,dx = \frac{1}{L} \hat{f}(n/L).$$

Alternatively, we may write $f(x) = \delta \sum_{n=-\infty}^{\infty} \hat{f}(n\delta) e^{2\pi i n \delta x}$ with $\delta = 1/L$.

(b) Prove that if $F$ is continuous and of moderate decrease, then

$$\int_{-\infty}^{\infty} F(\xi)\,d\xi = \lim_{\delta \to 0,\ \delta > 0} \delta \sum_{n=-\infty}^{\infty} F(\delta n).$$

(c) Conclude that $f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) e^{2\pi i x \xi}\,d\xi$.

[Hint: For (a), note that the Fourier series of $f$ on $[-L/2, L/2]$ converges
absolutely. For (b), first approximate the integral by $\int_{-N}^{N} F$ and the sum by
$\delta \sum_{|n| \leq N/\delta} F(n\delta)$. Then approximate the second integral by Riemann sums.]
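
The limit in part (b) can be observed numerically before it is proved. The following is only a sketch: the test function $F(\xi) = 1/(1+\xi^2)$, whose integral over the real line is $\pi$, and the names `riemann_sum` and `cutoff` are our own choices and are not part of the exercise.

```python
import math

def riemann_sum(F, delta, cutoff=1e4):
    """Return delta * (sum of F(delta * n)) over all integers n with |delta * n| <= cutoff."""
    n_max = int(cutoff / delta)
    total = F(0.0)
    for n in range(1, n_max + 1):
        total += F(delta * n) + F(-delta * n)
    return delta * total

def F(xi):
    # A function of moderate decrease; its integral over the real line is pi.
    return 1.0 / (1.0 + xi * xi)

for delta in (1.0, 0.5, 0.1):
    # The sums approach pi as delta decreases (up to the small truncation error).
    print(delta, riemann_sum(F, delta), math.pi)
```

The truncation at `cutoff` plays the role of the parameter $N$ in the hint: the discarded tails of both the sum and the integral are small because $F$ is of moderate decrease.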

2. Let $f$ and $g$ be the functions defined by

$$f(x) = \chi_{[-1,1]}(x) = \begin{cases} 1 & \text{if } |x| \leq 1, \\ 0 & \text{otherwise,} \end{cases} \qquad \text{and} \qquad g(x) = \begin{cases} 1 - |x| & \text{if } |x| \leq 1, \\ 0 & \text{otherwise.} \end{cases}$$

Although $f$ is not continuous, the integral defining its Fourier transform still
makes sense. Show that

$$\hat{f}(\xi) = \frac{\sin 2\pi\xi}{\pi\xi} \qquad \text{and} \qquad \hat{g}(\xi) = \left(\frac{\sin \pi\xi}{\pi\xi}\right)^2,$$

with the understanding that $\hat{f}(0) = 2$ and $\hat{g}(0) = 1$.
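
As a sanity check on the two formulas, one can approximate the defining integrals by a quadrature rule. The sketch below uses the midpoint rule; the helper `fourier_transform_even` is a hypothetical name of ours and relies on the fact that both functions are even, so only the cosine part of $e^{-2\pi i x \xi}$ contributes.

```python
import math

def fourier_transform_even(func, xi, a=1.0, steps=20000):
    """Midpoint-rule approximation of the integral of func(x) * exp(-2*pi*i*x*xi)
    over [-a, a], valid for an even real-valued func (only the cosine part survives)."""
    h = 2 * a / steps
    total = 0.0
    for k in range(steps):
        x = -a + (k + 0.5) * h
        total += func(x) * math.cos(2 * math.pi * x * xi)
    return total * h

def f(x):  # characteristic function of [-1, 1]
    return 1.0 if abs(x) <= 1 else 0.0

def g(x):  # triangle function
    return 1.0 - abs(x) if abs(x) <= 1 else 0.0

def f_hat(xi):  # claimed transform of f
    return 2.0 if xi == 0 else math.sin(2 * math.pi * xi) / (math.pi * xi)

def g_hat(xi):  # claimed transform of g
    return 1.0 if xi == 0 else (math.sin(math.pi * xi) / (math.pi * xi)) ** 2

for xi in (0.0, 0.3, 1.7):
    print(xi, fourier_transform_even(f, xi), f_hat(xi),
          fourier_transform_even(g, xi), g_hat(xi))
```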

3. The following exercise illustrates the principle that the decay of $\hat{f}$ is related
to the continuity properties of $f$.

(a) Suppose that $f$ is a function of moderate decrease on $\mathbb{R}$ whose Fourier
transform $\hat{f}$ is continuous and satisfies

$$\hat{f}(\xi) = O\left(\frac{1}{|\xi|^{1+\alpha}}\right) \quad \text{as } |\xi| \to \infty$$

for some $0 < \alpha < 1$. Prove that $f$ satisfies a Hölder condition of order $\alpha$,
that is, that

$$|f(x+h) - f(x)| \leq M|h|^{\alpha} \quad \text{for some } M > 0 \text{ and all } x, h \in \mathbb{R}.$$

(b) Let $f$ be a continuous function on $\mathbb{R}$ which vanishes for $|x| \geq 1$, with
$f(0) = 0$, and which is equal to $1/\log(1/|x|)$ for all $x$ in a neighborhood
of the origin. Prove that $\hat{f}$ is not of moderate decrease. In fact, there is
no $\epsilon > 0$ so that $\hat{f}(\xi) = O(1/|\xi|^{1+\epsilon})$ as $|\xi| \to \infty$.

[Hint: For part (a), use the Fourier inversion formula to express $f(x+h) - f(x)$
as an integral involving $\hat{f}$, and estimate this integral separately for $\xi$ in the two
ranges $|\xi| \leq 1/|h|$ and $|\xi| \geq 1/|h|$.]

4. Bump functions. Examples of compactly supported functions in $\mathcal{S}(\mathbb{R})$ are
very handy in many applications in analysis. Some examples are:

(a) Suppose $a < b$, and $f$ is the function such that $f(x) = 0$ if $x \leq a$ or $x \geq b$
and

$$f(x) = e^{-1/(x-a)} e^{-1/(b-x)} \quad \text{if } a < x < b.$$

Show that $f$ is indefinitely differentiable on $\mathbb{R}$.

(b) Prove that there exists an indefinitely differentiable function $F$ on $\mathbb{R}$ such
that $F(x) = 0$ if $x \leq a$, $F(x) = 1$ if $x \geq b$, and $F$ is strictly increasing on
$[a, b]$.

(c) Let $\delta > 0$ be so small that $a + \delta < b - \delta$. Show that there exists an
indefinitely differentiable function $g$ such that $g$ is 0 if $x \leq a$ or $x \geq b$, $g$ is 1 on
$[a + \delta, b - \delta]$, and $g$ is strictly monotonic on $[a, a + \delta]$ and $[b - \delta, b]$.

[Hint: For (b) consider $F(x) = c \int_{-\infty}^{x} f(t)\,dt$ where $c$ is an appropriate constant.]
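
The constructions in (a) and (b) are easy to experiment with. The sketch below evaluates the bump function of part (a) and the smooth step suggested by the hint, with the integral replaced by a midpoint-rule approximation; the names `bump` and `smooth_step` and the default interval $[0, 1]$ are assumptions made for illustration only.

```python
import math

def bump(x, a=0.0, b=1.0):
    """f(x) = exp(-1/(x-a)) * exp(-1/(b-x)) on (a, b), and 0 elsewhere."""
    if x <= a or x >= b:
        return 0.0
    return math.exp(-1.0 / (x - a)) * math.exp(-1.0 / (b - x))

def smooth_step(x, a=0.0, b=1.0, steps=4000):
    """F(x) = c * (integral of bump from a to x), with c chosen so that F(b) = 1."""
    def integral(upper):
        if upper <= a:
            return 0.0
        upper = min(upper, b)
        h = (upper - a) / steps
        return h * sum(bump(a + (k + 0.5) * h, a, b) for k in range(steps))
    return integral(x) / integral(b)

for x in (-0.5, 0.0, 0.25, 0.5, 0.75, 1.0, 1.5):
    print(x, bump(x), smooth_step(x))
```

A function as in part (c) can then be obtained, for instance, as a product of a rising step on $[a, a+\delta]$ and a reflected step on $[b-\delta, b]$.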

5. Suppose $f$ is continuous and of moderate decrease.

(a) Prove that $\hat{f}$ is continuous and $\hat{f}(\xi) \to 0$ as $|\xi| \to \infty$.

(b) Show that if $\hat{f}(\xi) = 0$ for all $\xi$, then $f$ is identically 0.

[Hint: For part (a), show that $\hat{f}(\xi) = \frac{1}{2} \int_{-\infty}^{\infty} [f(x) - f(x - 1/(2\xi))] e^{-2\pi i x \xi}\,dx$.
For part (b), verify that the multiplication formula $\int f(x)\hat{g}(x)\,dx = \int \hat{f}(y)g(y)\,dy$
still holds whenever $g \in \mathcal{S}(\mathbb{R})$.]

6. The function $e^{-\pi x^2}$ is its own Fourier transform. Generate other functions
that (up to a constant multiple) are their own Fourier transforms. What must
the constant multiples be? To decide this, prove that $\mathcal{F}^4 = I$. Here $\mathcal{F}(f) = \hat{f}$
is the Fourier transform, $\mathcal{F}^4 = \mathcal{F} \circ \mathcal{F} \circ \mathcal{F} \circ \mathcal{F}$, and $I$ is the identity operator
$(If)(x) = f(x)$ (see also Problem 7).

7. Prove that the convolution of two functions of moderate decrease is a function
of moderate decrease.

[Hint: Write

$$\int f(x - y)g(y)\,dy = \int_{|y| \leq |x|/2} + \int_{|y| \geq |x|/2}.$$

In the first integral $f(x - y) = O(1/(1 + x^2))$ while in the second integral
$g(y) = O(1/(1 + x^2))$.]


8. Prove that if $f$ is continuous, of moderate decrease, and $\int_{-\infty}^{\infty} f(y) e^{-y^2} e^{2xy}\,dy = 0$
for all $x \in \mathbb{R}$, then $f = 0$.

[Hint: Consider $f * e^{-x^2}$.]


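9. If $f$ is of moderate decrease, then

(14) $$\int_{-R}^{R} \left(1 - \frac{|\xi|}{R}\right) \hat{f}(\xi) e^{2\pi i x \xi}\,d\xi = (f * \mathcal{F}_R)(x),$$

where the Fejér kernel on the real line is defined by

$$\mathcal{F}_R(t) = \begin{cases} R \left(\dfrac{\sin \pi t R}{\pi t R}\right)^2 & \text{if } t \neq 0, \\ R & \text{if } t = 0. \end{cases}$$

Show that $\{\mathcal{F}_R\}$ is a family of good kernels as $R \to \infty$, and therefore (14) tends
uniformly to $f(x)$ as $R \to \infty$. This is the analogue of Fejér's theorem for Fourier
series in the context of the Fourier transform.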


10. Below is an outline of a different proof of the Weierstrass approximation
theorem.

Define the Landau kernels by

$$L_n(x) = \begin{cases} \dfrac{(1 - x^2)^n}{c_n} & \text{if } -1 \leq x \leq 1, \\ 0 & \text{if } |x| \geq 1, \end{cases}$$

where $c_n$ is chosen so that $\int_{-\infty}^{\infty} L_n(x)\,dx = 1$. Prove that $\{L_n\}_{n \geq 0}$ is a family
of good kernels as $n \to \infty$. As a result, show that if $f$ is a continuous function
supported in $[-1/2, 1/2]$, then $(f * L_n)(x)$ is a sequence of polynomials on
$[-1/2, 1/2]$ which converges uniformly to $f$.

[Hint: First show that $c_n \geq 2/(n + 1)$.]
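
Both the bound in the hint and the concentration of mass near the origin can be observed numerically. The following sketch approximates $c_n$ and the mass of $L_n$ on $\delta \leq |x| \leq 1$ by the midpoint rule; the function names are hypothetical, and the computation illustrates the good-kernel property without proving it.

```python
import math

def landau_constant(n, steps=20000):
    """c_n = integral over [-1, 1] of (1 - x^2)^n, by the midpoint rule."""
    h = 2.0 / steps
    return h * sum((1.0 - (-1.0 + (k + 0.5) * h) ** 2) ** n for k in range(steps))

def landau_tail_mass(n, delta, steps=20000):
    """Integral of L_n over delta <= |x| <= 1; for good kernels this tends to 0."""
    c_n = landau_constant(n, steps)
    h = (1.0 - delta) / steps
    one_side = h * sum((1.0 - (delta + (k + 0.5) * h) ** 2) ** n for k in range(steps))
    return 2.0 * one_side / c_n

for n in (1, 5, 50, 200):
    print(n, landau_constant(n), 2.0 / (n + 1), landau_tail_mass(n, 0.2))
```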


11. Suppose that $u$ is the solution to the heat equation given by $u = f * \mathcal{H}_t$
where $f \in \mathcal{S}(\mathbb{R})$. If we also set $u(x, 0) = f(x)$, prove that $u$ is continuous on the
closure of the upper half-plane, and vanishes at infinity, that is,

$$u(x, t) \to 0 \quad \text{as } |x| + t \to \infty.$$

[Hint: To prove that $u$ vanishes at infinity, show that (i) $|u(x,t)| \leq C/\sqrt{t}$ and (ii)
$|u(x,t)| \leq C/(1 + |x|^2) + C t^{-1/2} e^{-cx^2/t}$. Use (i) when $|x| \leq t$, and (ii) otherwise.]


12. Show that the function defined by

$$u(x, t) = \frac{x}{t} \mathcal{H}_t(x)$$

satisfies the heat equation for $t > 0$ and $\lim_{t \to 0} u(x, t) = 0$ for every $x$, but $u$ is
not continuous at the origin.

[Hint: Approach the origin with $(x, t)$ on the parabola $x^2/4t = c$ where $c$ is a
constant.]
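
The behavior described in the hint can be seen numerically. The sketch below assumes the heat kernel has the form $\mathcal{H}_t(x) = (4\pi t)^{-1/2} e^{-x^2/4t}$, as in the earlier part of the chapter, and compares the values of $u$ along the parabola $x^2/4t = c$ with the values at a fixed $x$.

```python
import math

def heat_kernel(x, t):
    """H_t(x) = (4*pi*t)^(-1/2) * exp(-x^2 / (4t)); this form is assumed from the chapter."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def u(x, t):
    return (x / t) * heat_kernel(x, t)

c = 1.0
for t in (1e-1, 1e-2, 1e-3):
    x = math.sqrt(4.0 * c * t)      # point on the parabola x^2 / (4t) = c
    # On the parabola u grows like 1/t, while at the fixed point x = 0.5 it tends to 0.
    print(t, u(x, t), u(0.5, t))
```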


13. Prove the following uniqueness theorem for harmonic functions in the strip
$\{(x, y) : 0 < y < 1,\ -\infty < x < \infty\}$: if $u$ is harmonic in the strip, continuous on
its closure with $u(x, 0) = u(x, 1) = 0$ for all $x \in \mathbb{R}$, and $u$ vanishes at infinity,
then $u = 0$.


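14. Prove that the periodization of the Fejér kernel $\mathcal{F}_N$ on the real line
(Exercise 9) is equal to the Fejér kernel for periodic functions of period 1. In other
words,

$$\sum_{n=-\infty}^{\infty} \mathcal{F}_N(x + n) = F_N(x),$$

when $N \geq 1$ is an integer, and where

$$F_N(x) = \sum_{n=-N}^{N} \left(1 - \frac{|n|}{N}\right) e^{2\pi i n x} = \frac{1}{N} \frac{\sin^2(N\pi x)}{\sin^2(\pi x)}.$$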
15. This exercise provides another example of periodization.

(a) Apply the Poisson summation formula to the function $g$ in Exercise 2 to
obtain

$$\sum_{n=-\infty}^{\infty} \frac{1}{(n + \alpha)^2} = \frac{\pi^2}{(\sin \pi\alpha)^2}$$

whenever $\alpha$ is real, but not equal to an integer.

(b) Prove as a consequence that

(15) $$\sum_{n=-\infty}^{\infty} \frac{1}{(n + \alpha)} = \frac{\pi}{\tan \pi\alpha}$$

whenever $\alpha$ is real but not equal to an integer. [Hint: First prove it when
$0 < \alpha < 1$. To do so, integrate the formula in (a). What is the precise
meaning of the series on the left-hand side of (15)? Evaluate at $\alpha = 1/2$.]

16. The Dirichlet kernel on the real line is defined by

$$\int_{-R}^{R} \hat{f}(\xi) e^{2\pi i x \xi}\,d\xi = (f * \mathcal{D}_R)(x) \quad \text{so that} \quad \mathcal{D}_R(x) = \widehat{\chi_{[-R,R]}}(x) = \frac{\sin(2\pi R x)}{\pi x}.$$

Also, the modified Dirichlet kernel for periodic functions of period 1 is defined
by

$$D_N^*(x) = \sum_{|n| \leq N-1} e^{2\pi i n x} + \frac{1}{2}\left(e^{-2\pi i N x} + e^{2\pi i N x}\right).$$

Show that the result in Exercise 15 gives

$$\sum_{n=-\infty}^{\infty} \mathcal{D}_N(x + n) = D_N^*(x),$$

where $N \geq 1$ is an integer, and the infinite series must be summed symmetrically.
In other words, the periodization of $\mathcal{D}_N$ is the modified Dirichlet kernel $D_N^*$.

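17. The gamma function is defined for $s > 0$ by

$$\Gamma(s) = \int_0^{\infty} e^{-x} x^{s-1}\,dx.$$

(a) Show that for $s > 0$ the above integral makes sense, that is, that the
following two limits exist:

$$\lim_{\delta \to 0,\ \delta > 0} \int_{\delta}^{1} e^{-x} x^{s-1}\,dx \qquad \text{and} \qquad \lim_{A \to \infty} \int_{1}^{A} e^{-x} x^{s-1}\,dx.$$

(b) Prove that $\Gamma(s + 1) = s\Gamma(s)$ whenever $s > 0$, and conclude that for every
integer $n \geq 1$ we have $\Gamma(n + 1) = n!$.

(c) Show that

$$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi} \qquad \text{and} \qquad \Gamma\left(\tfrac{3}{2}\right) = \frac{\sqrt{\pi}}{2}.$$

[Hint: For (c), use $\int_{-\infty}^{\infty} e^{-\pi x^2}\,dx = 1$.]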
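
The values in (b) and (c) can be checked against a direct quadrature. The sketch below first substitutes $x = y^2$, so that $\Gamma(s) = 2\int_0^{\infty} e^{-y^2} y^{2s-1}\,dy$; this substitution is our own device to remove the singularity at the origin when $s = 1/2$, and the function name and cutoff are hypothetical.

```python
import math

def gamma_integral(s, upper=8.0, steps=100000):
    """Midpoint-rule value of 2 * integral over (0, upper) of exp(-y^2) * y^(2s - 1),
    which equals Gamma(s) after the substitution x = y^2 (up to a negligible tail)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += math.exp(-y * y) * y ** (2.0 * s - 1.0)
    return 2.0 * h * total

print(gamma_integral(0.5), math.sqrt(math.pi))
print(gamma_integral(1.5), math.sqrt(math.pi) / 2.0)
print(gamma_integral(5.0), math.factorial(4))
```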

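18. The zeta function is defined for $s > 1$ by $\zeta(s) = \sum_{n=1}^{\infty} 1/n^s$. Verify the
identity

$$\pi^{-s/2} \Gamma(s/2) \zeta(s) = \frac{1}{2} \int_0^{\infty} t^{\frac{s}{2} - 1} (\vartheta(t) - 1)\,dt \quad \text{whenever } s > 1$$

where $\Gamma$ and $\vartheta$ are the gamma and theta functions, respectively:

$$\Gamma(s) = \int_0^{\infty} e^{-t} t^{s-1}\,dt \qquad \text{and} \qquad \vartheta(s) = \sum_{n=-\infty}^{\infty} e^{-\pi n^2 s}.$$

More about the zeta function and its relation to the prime number theorem can
be found in Book II.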


19. The following is a variant of the calculation of $\zeta(2m) = \sum_{n=1}^{\infty} 1/n^{2m}$ found
in Problem 4, Chapter 3.

(a) Apply the Poisson summation formula to $f(x) = t/(\pi(x^2 + t^2))$
and $\hat{f}(\xi) = e^{-2\pi t|\xi|}$ where $t > 0$ in order to get

$$\frac{1}{\pi} \sum_{n=-\infty}^{\infty} \frac{t}{t^2 + n^2} = \sum_{n=-\infty}^{\infty} e^{-2\pi t|n|}.$$

(b) Prove the following identity valid for $0 < t < 1$:

$$\frac{1}{\pi} \sum_{n=-\infty}^{\infty} \frac{t}{t^2 + n^2} = \frac{1}{\pi t} + \frac{2}{\pi} \sum_{m=1}^{\infty} (-1)^{m+1} \zeta(2m) t^{2m-1}$$

as well as

$$\sum_{n=-\infty}^{\infty} e^{-2\pi t|n|} = \frac{2}{1 - e^{-2\pi t}} - 1.$$

(c) Use the fact that

$$\frac{z}{e^z - 1} = 1 - \frac{z}{2} + \sum_{m=1}^{\infty} \frac{B_{2m}}{(2m)!} z^{2m},$$

where $B_k$ are the Bernoulli numbers to deduce from the above formula,

$$2\zeta(2m) = (-1)^{m+1} \frac{(2\pi)^{2m}}{(2m)!} B_{2m}.$$
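
The identity in (a) and the final formula in (c) lend themselves to a quick numerical check. In the sketch below the series on the left of (a) is truncated, the right-hand side uses the closed form from (b), and the values $B_2 = 1/6$ and $B_4 = -1/30$ are the standard Bernoulli numbers, supplied here by hand. The function names are our own.

```python
import math
from fractions import Fraction

def lhs(t, n_max=200000):
    """(1/pi) * (sum over |n| <= n_max of t / (t^2 + n^2))."""
    total = 1.0 / t
    for n in range(1, n_max + 1):
        total += 2.0 * t / (t * t + n * n)
    return total / math.pi

def rhs(t):
    """Closed form of the sum over n of exp(-2*pi*t*|n|), namely 2 / (1 - exp(-2*pi*t)) - 1."""
    return 2.0 / (1.0 - math.exp(-2.0 * math.pi * t)) - 1.0

def zeta_even(m, bernoulli_2m):
    """zeta(2m) obtained from 2*zeta(2m) = (-1)^(m+1) * (2*pi)^(2m) / (2m)! * B_{2m}."""
    sign = (-1) ** (m + 1)
    return sign * (2 * math.pi) ** (2 * m) / math.factorial(2 * m) * float(bernoulli_2m) / 2

B2, B4 = Fraction(1, 6), Fraction(-1, 30)   # standard Bernoulli numbers

print(lhs(0.7), rhs(0.7))
print(zeta_even(1, B2), math.pi ** 2 / 6)
print(zeta_even(2, B4), math.pi ** 4 / 90)
```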


20. The following results are relevant in information theory when one tries to
recover a signal from its samples.

Suppose $f$ is of moderate decrease and that its Fourier transform $\hat{f}$ is
supported in $I = [-1/2, 1/2]$. Then, $f$ is entirely determined by its restriction to
$\mathbb{Z}$. This means that if $g$ is another function of moderate decrease whose Fourier
transform is supported in $I$ and $f(n) = g(n)$ for all $n \in \mathbb{Z}$, then $f = g$. More
precisely:

(a) Prove that the following reconstruction formula holds:

$$f(x) = \sum_{n=-\infty}^{\infty} f(n) K(x - n) \quad \text{where } K(y) = \frac{\sin \pi y}{\pi y}.$$

Note that $K(y) = O(1/|y|)$ as $|y| \to \infty$.

(b) If $\lambda > 1$, then

$$f(x) = \sum_{n=-\infty}^{\infty} \frac{1}{\lambda} f\left(\frac{n}{\lambda}\right) K_{\lambda}\left(x - \frac{n}{\lambda}\right) \quad \text{where } K_{\lambda}(y) = \frac{\cos \pi y - \cos \pi\lambda y}{\pi^2 (\lambda - 1) y^2}.$$

Thus, if one samples $f$ "more often," the series in the reconstruction
formula converges faster since $K_{\lambda}(y) = O(1/|y|^2)$ as $|y| \to \infty$. Note that
$K_{\lambda}(y) \to K(y)$ as $\lambda \to 1$.

(c) Prove that $\int_{-\infty}^{\infty} |f(x)|^2\,dx = \sum_{n=-\infty}^{\infty} |f(n)|^2$.

[Hint: For part (a) show that if $\chi$ is the characteristic function of $I$, then
$\hat{f}(\xi) = \chi(\xi) \sum_{n=-\infty}^{\infty} f(n) e^{-2\pi i n \xi}$. For (b) use the function in Figure 2 instead
of $\chi(\xi)$.]
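
The reconstruction formula in (a) and the identity in (c) can be tried on a concrete function. The sketch below uses $f(x) = (\sin(\pi x/2)/(\pi x/2))^2$, whose Fourier transform is a triangle supported in $[-1/2, 1/2]$ by Exercise 2 and a change of scale. This choice of $f$, the truncation `n_max`, and the value $4/3$ for $\int |f|^2$ (computed separately by Plancherel) are assumptions of ours rather than part of the exercise.

```python
import math

def sinc(y):
    """K(y) = sin(pi*y) / (pi*y), with K(0) = 1."""
    return 1.0 if y == 0 else math.sin(math.pi * y) / (math.pi * y)

def f(x):
    # The Fourier transform of f is a triangle supported in [-1/2, 1/2].
    return sinc(x / 2.0) ** 2

def reconstruct(x, n_max=2000):
    """Truncated reconstruction: sum over |n| <= n_max of f(n) * K(x - n)."""
    return sum(f(n) * sinc(x - n) for n in range(-n_max, n_max + 1))

for x in (0.3, 1.25, -2.7):
    print(x, f(x), reconstruct(x))

# Part (c): the sum of |f(n)|^2 should agree with the integral of |f|^2, which is 4/3 here.
print(sum(f(n) ** 2 for n in range(-2000, 2001)))
```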


21. Suppose that $f$ is continuous on $\mathbb{R}$. Show that $f$ and $\hat{f}$ cannot both be
compactly supported unless $f = 0$. This can be viewed in the same spirit as the
uncertainty principle.

[Hint: Assume $f$ is supported in $[0, 1/2]$. Expand $f$ in a Fourier series in the
interval $[0, 1]$, and note that as a result, $f$ is a trigonometric polynomial.]


22. The heuristic assertion stated before Theorem 4.1 can be made precise as
follows. If $F$ is a function on $\mathbb{R}$, then we say that the preponderance of its mass
is contained in an interval $I$ (centered at the origin) if

(16) $$\int_I x^2 |F(x)|^2\,dx \geq \frac{1}{2} \int_{\mathbb{R}} x^2 |F(x)|^2\,dx.$$

Now suppose $f \in \mathcal{S}$, and (16) holds with $F = f$ and $I = I_1$; also with $F = \hat{f}$ and
$I = I_2$. Then if $L_j$ denotes the length of $I_j$, we have

$$L_1 L_2 \geq \frac{1}{2\pi}.$$

A similar conclusion holds if the intervals are not necessarily centered at the
origin.


23. The Heisenberg uncertainty principle can be formulated in terms of the
operator $L = -\frac{d^2}{dx^2} + x^2$, which acts on Schwartz functions by the formula

$$L(f) = -\frac{d^2 f}{dx^2} + x^2 f.$$

This operator, sometimes called the Hermite operator, is the quantum
analogue of the harmonic oscillator. Consider the usual inner product on $\mathcal{S}$ given
by

$$(f, g) = \int_{-\infty}^{\infty} f(x) \overline{g(x)}\,dx \quad \text{whenever } f, g \in \mathcal{S}.$$

(a) Prove that the Heisenberg uncertainty principle implies

$$(Lf, f) \geq (f, f) \quad \text{for all } f \in \mathcal{S}.$$

This is usually denoted by $L \geq I$. [Hint: Integrate by parts.]

(b) Consider the operators $A$ and $A^*$ defined on $\mathcal{S}$ by

$$A(f) = \frac{df}{dx} + xf \qquad \text{and} \qquad A^*(f) = -\frac{df}{dx} + xf.$$

The operators $A$ and $A^*$ are sometimes called the annihilation and
creation operators, respectively. Prove that for all $f, g \in \mathcal{S}$ we have

(i) $(Af, g) = (f, A^* g)$,

(ii) $(Af, Af) = (A^* A f, f) \geq 0$,

(iii) $A^* A = L - I$.

In particular, this again shows that $L \geq I$.

(c) Now for $t \in \mathbb{R}$, let

$$A_t(f) = \frac{df}{dx} + txf \qquad \text{and} \qquad A_t^*(f) = -\frac{df}{dx} + txf.$$

Use the fact that $(A_t^* A_t f, f) \geq 0$ to give another proof of the Heisenberg
uncertainty principle which says that whenever $\int_{-\infty}^{\infty} |f(x)|^2\,dx = 1$ then

$$\left(\int_{-\infty}^{\infty} x^2 |f(x)|^2\,dx\right) \left(\int_{-\infty}^{\infty} \left|\frac{df}{dx}\right|^2 dx\right) \geq 1/4.$$

[Hint: Think of $(A_t^* A_t f, f)$ as a quadratic polynomial in $t$.]


6 Problems 


1. The equation

(17) $$x^2 \frac{\partial^2 u}{\partial x^2} + ax \frac{\partial u}{\partial x} = \frac{\partial u}{\partial t}$$

with $u(x, 0) = f(x)$ for $0 < x < \infty$ and $t > 0$ is a variant of the heat equation
which occurs in a number of applications. To solve (17), make the change of
variables $x = e^{-y}$ so that $-\infty < y < \infty$. Set $U(y, t) = u(e^{-y}, t)$ and
$F(y) = f(e^{-y})$. Then the problem reduces to the equation

$$\frac{\partial^2 U}{\partial y^2} + (1 - a) \frac{\partial U}{\partial y} = \frac{\partial U}{\partial t},$$

with $U(y, 0) = F(y)$. This can be solved like the usual heat equation (the case
$a = 1$) by taking the Fourier transform in the $y$ variable. One must then compute
the integral $\int_{-\infty}^{\infty} e^{(-4\pi^2 \xi^2 + (1-a) 2\pi i \xi)t} e^{2\pi i \xi v}\,d\xi$. Show that the solution of the
original problem is then given by

$$u(x, t) = \frac{1}{(4\pi t)^{1/2}} \int_0^{\infty} e^{-(\log(v/x) + (1-a)t)^2/(4t)} f(v) \frac{dv}{v}.$$


2. The Black-Scholes equation from finance theory is

(18) $$\frac{\partial V}{\partial t} + rs \frac{\partial V}{\partial s} + \frac{\sigma^2 s^2}{2} \frac{\partial^2 V}{\partial s^2} - rV = 0, \quad 0 < t < T,$$

subject to the "final" boundary condition $V(s, T) = F(s)$. An appropriate change
of variables reduces this to the equation in Problem 1. Alternatively, the
substitution $V(s, t) = e^{ax + b\tau} U(x, \tau)$ where $x = \log s$, $\tau = \frac{\sigma^2}{2}(T - t)$, $a = \frac{1}{2} - \frac{r}{\sigma^2}$, and
$b = -\left(\frac{1}{2} + \frac{r}{\sigma^2}\right)^2$ reduces (18) to the one-dimensional heat equation with the
initial condition $U(x, 0) = e^{-ax} F(e^x)$. Thus a solution to the Black-Scholes
equation is

$$V(s, t) = \frac{e^{-r(T-t)}}{\sqrt{2\pi\sigma^2 (T - t)}} \int_0^{\infty} e^{-\frac{(\log(s/s^*) + (r - \sigma^2/2)(T - t))^2}{2\sigma^2 (T - t)}} F(s^*)\,ds^*.$$


3.* The Dirichlet problem in a strip. Consider the equation $\triangle u = 0$ in the
horizontal strip

$$\{(x, y) : 0 < y < 1,\ -\infty < x < \infty\}$$

with boundary conditions $u(x, 0) = f_0(x)$ and $u(x, 1) = f_1(x)$, where $f_0$ and $f_1$
are both in the Schwartz space.

(a) Show (formally) that if $u$ is a solution to this problem, then

$$\hat{u}(\xi, y) = A(\xi) e^{2\pi\xi y} + B(\xi) e^{-2\pi\xi y}.$$

Express $A$ and $B$ in terms of $\hat{f}_0$ and $\hat{f}_1$, and show that

$$\hat{u}(\xi, y) = \frac{\sinh(2\pi(1 - y)\xi)}{\sinh(2\pi\xi)} \hat{f}_0(\xi) + \frac{\sinh(2\pi y \xi)}{\sinh(2\pi\xi)} \hat{f}_1(\xi).$$

(b) Prove as a result that

$$\int_{-\infty}^{\infty} |u(x, y) - f_0(x)|^2\,dx \to 0 \quad \text{as } y \to 0$$

and

$$\int_{-\infty}^{\infty} |u(x, y) - f_1(x)|^2\,dx \to 0 \quad \text{as } y \to 1.$$

(c) If $\Phi(\xi) = (\sinh 2\pi a \xi)/(\sinh 2\pi\xi)$, with $0 \leq a < 1$, then $\Phi$ is the Fourier
transform of $\varphi$ where

$$\varphi(x) = \frac{\sin \pi a}{2} \cdot \frac{1}{\cosh \pi x + \cos \pi a}.$$

This can be shown, for instance, by using contour integration and the
residue formula from complex analysis (see Book II, Chapter 3).

(d) Use this result to express $u$ in terms of Poisson-like integrals involving $f_0$
and $f_1$ as follows:

$$u(x, y) = \frac{\sin \pi y}{2} \left( \int_{-\infty}^{\infty} \frac{f_0(x - t)}{\cosh \pi t - \cos \pi y}\,dt + \int_{-\infty}^{\infty} \frac{f_1(x - t)}{\cosh \pi t + \cos \pi y}\,dt \right).$$

(e) Finally, one can check that the function $u(x, y)$ defined by the above
expression is harmonic in the strip, and converges uniformly to $f_0(x)$ as
$y \to 0$, and to $f_1(x)$ as $y \to 1$. Moreover, one sees that $u(x, y)$ vanishes at
infinity, that is, $\lim_{|x| \to \infty} u(x, y) = 0$, uniformly in $y$.
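
The Fourier transform pair in part (c) of Problem 3 can be checked numerically without contour integration. The sketch below approximates $\hat{\varphi}(\xi)$ by the midpoint rule on a truncated interval; since $\varphi$ is even, only the cosine part is needed. The function names, the cutoff, and the sample value $a = 0.3$ are our own choices.

```python
import math

def phi(x, a):
    """phi(x) = (sin(pi*a) / 2) / (cosh(pi*x) + cos(pi*a))."""
    return 0.5 * math.sin(math.pi * a) / (math.cosh(math.pi * x) + math.cos(math.pi * a))

def phi_hat(xi, a, cutoff=15.0, steps=60000):
    """Midpoint-rule value of the integral of phi(x) * cos(2*pi*x*xi) over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0.0
    for k in range(steps):
        x = -cutoff + (k + 0.5) * h
        total += phi(x, a) * math.cos(2.0 * math.pi * x * xi)
    return total * h

def Phi(xi, a):
    """Phi(xi) = sinh(2*pi*a*xi) / sinh(2*pi*xi), with the limiting value a at xi = 0."""
    return a if xi == 0 else math.sinh(2 * math.pi * a * xi) / math.sinh(2 * math.pi * xi)

for xi in (0.0, 0.4, 1.1):
    print(xi, phi_hat(xi, 0.3), Phi(xi, 0.3))
```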


In Exercise 12, we gave an example of a function that satisfies the heat equation
in the upper half-plane, with boundary value 0, but which was not identically 0.
We observed in this case that $u$ was in fact not continuous up to the boundary.

In Problem 4 we exhibit examples illustrating non-uniqueness, but this time with
continuity up to the boundary $t = 0$. These examples satisfy a growth condition
at infinity, namely $|u(x, t)| \leq C e^{c x^{2+\epsilon}}$, for any $\epsilon > 0$. Problems 5 and 6 show
that under the more restrictive growth condition $|u(x, t)| \leq C e^{c x^2}$, uniqueness does
hold.

4.* If $g$ is a smooth function on $\mathbb{R}$, define the formal power series

(19) $$u(x, t) = \sum_{n=0}^{\infty} g^{(n)}(t) \frac{x^{2n}}{(2n)!}.$$

(a) Check formally that $u$ solves the heat equation.

(b) For $a > 0$, consider the function defined by

$$g(t) = \begin{cases} e^{-t^{-a}} & \text{if } t > 0, \\ 0 & \text{if } t \leq 0. \end{cases}$$

One can show that there exists $0 < \theta < 1$ depending on $a$ so that

$$|g^{(k)}(t)| \leq \frac{k!}{(\theta t)^k} e^{-\frac{1}{2} t^{-a}} \quad \text{for } t > 0.$$

(c) As a result, for each $x$ and $t$ the series (19) converges; $u$ solves the heat
equation; $u$ vanishes for $t = 0$; and $u$ satisfies the estimate $|u(x, t)| \leq
C e^{c|x|^{2a/(a-1)}}$ for some constants $C, c > 0$.

(d) Conclude that for every $\epsilon > 0$ there exists a non-zero solution to the heat
equation which is continuous for $x \in \mathbb{R}$ and $t \geq 0$, which satisfies $u(x, 0) = 0$
and $|u(x, t)| \leq C e^{c|x|^{2+\epsilon}}$.

5.* The following "maximum principle" for solutions of the heat equation will
be used in the next problem.

Theorem. Suppose that $u(x, t)$ is a real-valued solution of the heat equation
in the upper half-plane, which is continuous on its closure. Let $R$ denote the
rectangle

$$R = \{(x, t) \in \mathbb{R}^2 : a \leq x \leq b,\ 0 \leq t \leq c\}$$

and $\partial' R$ be the part of the boundary of $R$ which consists of the two vertical sides
and its base on the line $t = 0$ (see Figure 3). Then

$$\min_{(x,t) \in \partial' R} u(x, t) = \min_{(x,t) \in R} u(x, t) \qquad \text{and} \qquad \max_{(x,t) \in \partial' R} u(x, t) = \max_{(x,t) \in R} u(x, t).$$

Figure 3. The rectangle $R$ and part of its boundary $\partial' R$

The steps leading to a proof of this result are outlined below.

(a) Show that it suffices to prove that if $u \geq 0$ on $\partial' R$, then $u \geq 0$ in $R$.

(b) For $\epsilon > 0$, let

$$v(x, t) = u(x, t) + \epsilon t.$$

Then, $v$ has a minimum on $R$, say at $(x_1, t_1)$. Show that $x_1 = a$ or $b$,
or else $t_1 = 0$. To do so, suppose on the contrary that $a < x_1 < b$ and
$0 < t_1 \leq c$, and prove that $v_{xx}(x_1, t_1) - v_t(x_1, t_1) \leq -\epsilon$. However, show
also that the left-hand side must be non-negative.

(c) Deduce from (b) that $u(x, t) \geq \epsilon(t_1 - t)$ for any $(x, t) \in R$ and let $\epsilon \to 0$.

6.* The examples in Problem 4 are optimal in the sense of the following
uniqueness theorem due to Tychonoff.

Theorem. Suppose $u(x, t)$ satisfies the following conditions:

(i) $u(x, t)$ solves the heat equation for all $x \in \mathbb{R}$ and all $t > 0$.

(ii) $u(x, t)$ is continuous for all $x \in \mathbb{R}$ and $0 \leq t \leq c$.

(iii) $u(x, 0) = 0$.

(iv) $|u(x, t)| \leq M e^{a x^2}$ for some $M$, $a$, and all $x \in \mathbb{R}$, $0 \leq t < c$.

Then $u$ is identically equal to 0.

7.* The Hermite functions $h_k(x)$ are defined by the generating identity

$$\sum_{k=0}^{\infty} h_k(x) \frac{t^k}{k!} = e^{-(x^2/2 - 2tx + t^2)}.$$

(a) Show that an alternate definition of the Hermite functions is given by the
formula

$$h_k(x) = (-1)^k e^{x^2/2} \left(\frac{d}{dx}\right)^k e^{-x^2}.$$

[Hint: Write $e^{-(x^2/2 - 2tx + t^2)} = e^{x^2/2} e^{-(x - t)^2}$ and use Taylor's formula.]
Conclude from the above expression that each $h_k(x)$ is of the form
$P_k(x) e^{-x^2/2}$, where $P_k$ is a polynomial of degree $k$. In particular, the
Hermite functions belong to the Schwartz space and $h_0(x) = e^{-x^2/2}$,
$h_1(x) = 2x e^{-x^2/2}$.

(b) Prove that the family $\{h_k\}_{k=0}^{\infty}$ is complete in the sense that if $f$ is a
Schwartz function, and

$$(f, h_k) = \int_{-\infty}^{\infty} f(x) h_k(x)\,dx = 0 \quad \text{for all } k \geq 0,$$

then $f = 0$. [Hint: Use Exercise 8.]

(c) Define $h_k^*(x) = h_k((2\pi)^{1/2} x)$. Then

$$\widehat{h_k^*}(\xi) = (-i)^k h_k^*(\xi).$$

Therefore, each $h_k^*$ is an eigenfunction for the Fourier transform.

(d) Show that $h_k$ is an eigenfunction for the operator defined in Exercise 23,
and in fact, prove that

$$L h_k = (2k + 1) h_k.$$

In particular, we conclude that the functions $h_k$ are mutually orthogonal
for the $L^2$ inner product on the Schwartz space.

(e) Finally, show that $\int_{-\infty}^{\infty} [h_k(x)]^2\,dx = \pi^{1/2} 2^k k!$. [Hint: Square the
generating relation.]
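
The orthogonality in (d) and the normalization in (e) can be confirmed numerically. The sketch below rests on the observation that the generating identity gives $h_k(x) = H_k(x) e^{-x^2/2}$, where $H_k$ are the Hermite polynomials in the convention used by `numpy.polynomial.hermite` (generating function $e^{2tx - t^2}$); identifying $P_k$ with $H_k$ in this way is our own reading. Gauss-Hermite quadrature then integrates $H_j H_k e^{-x^2}$ exactly up to rounding.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss, hermval

def hermite_inner_product(j, k, nodes=60):
    """Integral over the real line of h_j(x) * h_k(x), where h_k(x) = H_k(x) * exp(-x^2 / 2).

    Gauss-Hermite quadrature integrates p(x) * exp(-x^2) exactly for polynomials p
    of degree < 2 * nodes, which covers H_j * H_k here.
    """
    x, w = hermgauss(nodes)
    e_j = np.zeros(j + 1)
    e_j[j] = 1.0                      # coefficient vector selecting H_j
    e_k = np.zeros(k + 1)
    e_k[k] = 1.0                      # coefficient vector selecting H_k
    return float(np.sum(w * hermval(x, e_j) * hermval(x, e_k)))

for k in range(5):
    expected = math.sqrt(math.pi) * 2 ** k * math.factorial(k)
    print(k, hermite_inner_product(k, k), expected, hermite_inner_product(k, k + 1))
```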
8.* To refine the results in Chapter 4, and to prove that

$$f_{\alpha}(x) = \sum_{n=0}^{\infty} 2^{-n\alpha} e^{2\pi i 2^n x}$$

is nowhere differentiable even in the case $\alpha = 1$, we need to consider a variant of
the delayed means $\triangle_N$, which in turn will be analyzed by the Poisson summation
formula.

(a) Fix an indefinitely differentiable function $\Phi$ satisfying

$$\Phi(\xi) = \begin{cases} 1 & \text{when } |\xi| \leq 1, \\ 0 & \text{when } |\xi| \geq 2. \end{cases}$$

By the Fourier inversion formula, there exists $\varphi \in \mathcal{S}$ so that $\hat{\varphi}(\xi) = \Phi(\xi)$.
Let $\varphi_N(x) = N\varphi(Nx)$ so that $\widehat{\varphi_N}(\xi) = \Phi(\xi/N)$. Finally, set

$$\tilde{\triangle}_N(x) = \sum_{n=-\infty}^{\infty} \varphi_N(x + n).$$

Observe by the Poisson summation formula that $\tilde{\triangle}_N(x) =
\sum_{n=-\infty}^{\infty} \Phi(n/N) e^{2\pi i n x}$, thus $\tilde{\triangle}_N$ is a trigonometric polynomial of degree
$\leq 2N$, with terms whose coefficients are 1 when $|n| \leq N$. Let

$$\tilde{\triangle}_N(f) = f * \tilde{\triangle}_N.$$

Note that

$$S_N(f_{\alpha}) = \tilde{\triangle}_{N'}(f_{\alpha})$$

where $N'$ is the largest integer of the form $2^k$ with $N' \leq N$.

(b) If we set $\tilde{\triangle}_N(x) = \varphi_N(x) + E_N(x)$ where

$$E_N(x) = \sum_{|n| \geq 1} \varphi_N(x + n),$$

then one sees that:

(i) $\sup_{|x| \leq 1/2} |E_N'(x)| \to 0$ as $N \to \infty$.

(ii) $|\tilde{\triangle}_N'(x)| \leq cN^2$.

(iii) $|\tilde{\triangle}_N'(x)| \leq c/(N|x|^3)$, for $|x| \leq 1/2$.

Moreover, $\int_{|x| \leq 1/2} \tilde{\triangle}_N'(x)\,dx = 0$, and $-\int_{|x| \leq 1/2} x \tilde{\triangle}_N'(x)\,dx \to 1$ as $N \to \infty$.

(c) The above estimates imply that if $f'(x_0)$ exists, then

$$(f * \tilde{\triangle}_N')(x_0 + h_N) \to f'(x_0) \quad \text{as } N \to \infty,$$

whenever $|h_N| \leq C/N$. Then, conclude that both the real and imaginary
parts of $f_1$ are nowhere differentiable, as in the proof given in Section 3,
Chapter 4.
It occurred to me that in order to improve treatment
planning one had to know the distribution of the
attenuation coefficient of tissues in the body. This
information would be useful for diagnostic purposes and
would constitute a tomogram or series of tomograms.

It was immediately evident that the problem was
a mathematical one. If a fine beam of gamma rays
of intensity $I_0$ is incident on the body and the
emerging density is $I$, then the measurable quantity $g$ equals
$\log(I_0/I) = \int_L f\,ds$, where $f$ is the variable absorption
coefficient along the line $L$. Hence if $f$ is a function of
two dimensions, and $g$ is known for all lines
intersecting the body, the question is, can $f$ be determined if
$g$ is known?

Fourteen years would elapse before I learned that
Radon had solved this problem in 1917.

A. M. Cormack, 1979


The previous chapter introduced the theory of the Fourier transform
on $\mathbb{R}$ and illustrated some of its applications to partial differential
equations. Here, our aim is to present an analogous theory for functions of
several variables.

After a brief review of some relevant notions in $\mathbb{R}^d$, we begin with some
general facts about the Fourier transform on the Schwartz space $\mathcal{S}(\mathbb{R}^d)$.
Fortunately, the main ideas and techniques have already been considered
in the one-dimensional case. In fact, with the appropriate notation, the
statements (and proofs) of the key theorems, such as the Fourier inversion
and Plancherel formulas, remain unchanged.

Next, we highlight the connection to some higher dimensional problems
in mathematical physics, and in particular we investigate the wave
equation in $d$ dimensions, with a detailed analysis in the cases $d = 3$
and $d = 2$. At this stage, we discover a rich interplay between the Fourier
transform and rotational symmetry, that arises only in $\mathbb{R}^d$ when $d \geq 2$.

Finally, the chapter ends with a discussion of the Radon transform.
This topic is of substantial interest in its own right, but in addition has
significant relevance in its application to the use of X-ray scans as well
as to other parts of mathematics.

1 Preliminaries 

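The setting in this chapter will be $\mathbb{R}^d$, the vector space¹ of all $d$-tuples of
real numbers $(x_1, \ldots, x_d)$ with $x_i \in \mathbb{R}$. Addition of vectors is componentwise,
and so is multiplication by real scalars. Given $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$
we define

$$|x| = (x_1^2 + \cdots + x_d^2)^{1/2},$$

so that $|x|$ is simply the length of the vector $x$ in the usual Euclidean
norm. In fact, we equip $\mathbb{R}^d$ with the standard inner product defined by

$$x \cdot y = x_1 y_1 + \cdots + x_d y_d,$$

so that $|x|^2 = x \cdot x$. We use the notation $x \cdot y$ in place of $(x, y)$ of
Chapter 3.

Given a $d$-tuple $\alpha = (\alpha_1, \ldots, \alpha_d)$ of non-negative integers (sometimes
called a multi-index), the monomial $x^{\alpha}$ is defined by

$$x^{\alpha} = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_d^{\alpha_d}.$$

Similarly, we define the differential operator $(\partial/\partial x)^{\alpha}$ by

$$\left(\frac{\partial}{\partial x}\right)^{\alpha} = \left(\frac{\partial}{\partial x_1}\right)^{\alpha_1} \left(\frac{\partial}{\partial x_2}\right)^{\alpha_2} \cdots \left(\frac{\partial}{\partial x_d}\right)^{\alpha_d} = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_d^{\alpha_d}},$$

where $|\alpha| = \alpha_1 + \cdots + \alpha_d$ is the order of the multi-index $\alpha$.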

1.1 Symmetries 

Analysis in $\mathbb{R}^d$, and in particular the theory of the Fourier transform, is
shaped by three important groups of symmetries of the underlying space:

(i) Translations 

(ii) Dilations 

(iii) Rotations 




¹ See Chapter 3 for a brief review of vector spaces and inner products. Here we find it
convenient to use lower case letters such as $x$ (as opposed to $X$) to designate points in
$\mathbb{R}^d$. Also, we use $|\cdot|$ instead of $\|\cdot\|$ to denote the Euclidean norm.

We have seen that translations $x \mapsto x + h$, with $h \in \mathbb{R}^d$ fixed, and
dilations $x \mapsto \delta x$, with $\delta > 0$, play an important role in the one-dimensional
theory. In $\mathbb{R}$, the only two rotations are the identity and multiplication
by $-1$. However, in $\mathbb{R}^d$ with $d \geq 2$ there are more rotations, and
the understanding of the interaction between the Fourier transform and
rotations leads to fruitful insights regarding spherical symmetries.

A rotation in $\mathbb{R}^d$ is a linear transformation $R : \mathbb{R}^d \to \mathbb{R}^d$ which
preserves the inner product. In other words,

• $R(ax + by) = aR(x) + bR(y)$ for all $x, y \in \mathbb{R}^d$ and $a, b \in \mathbb{R}$.

• $R(x) \cdot R(y) = x \cdot y$ for all $x, y \in \mathbb{R}^d$.

Equivalently, this last condition can be replaced by $|R(x)| = |x|$ for all
$x \in \mathbb{R}^d$, or $R^t = R^{-1}$ where $R^t$ and $R^{-1}$ denote the transpose and inverse
of $R$, respectively.² In particular, we have $\det(R) = \pm 1$, where $\det(R)$ is
the determinant of $R$. If $\det(R) = 1$ we say that $R$ is a proper rotation;
otherwise, we say that $R$ is an improper rotation.

Example 1. On the real line $\mathbb{R}$, there are two rotations: the identity
which is proper, and multiplication by $-1$ which is improper.

Example 2. The rotations in the plane $\mathbb{R}^2$ can be described in terms of
complex numbers. We identify $\mathbb{R}^2$ with $\mathbb{C}$ by assigning the point $(x, y)$
to the complex number $z = x + iy$. Under this identification, all proper
rotations are of the form $z \mapsto z e^{i\varphi}$ for some $\varphi \in \mathbb{R}$, and all improper
rotations are of the form $z \mapsto \bar{z} e^{i\varphi}$ for some $\varphi \in \mathbb{R}$ (here, $\bar{z} = x - iy$ denotes
the complex conjugate of $z$). See Exercise 1 for the argument leading to
this result.

Example 3. Euler gave the following very simple geometric description
of rotations in $\mathbb{R}^3$. Given a proper rotation $R$, there exists a unit vector
$\gamma$ so that:

(i) $R$ fixes $\gamma$, that is, $R(\gamma) = \gamma$.

(ii) If $\mathcal{P}$ denotes the plane passing through the origin and perpendicular
to $\gamma$, then $R : \mathcal{P} \to \mathcal{P}$, and the restriction of $R$ to $\mathcal{P}$ is a rotation
in $\mathbb{R}^2$.

Geometrically, the vector $\gamma$ gives the direction of the axis of rotation. A
proof of this fact is given in Exercise 2. Finally, if $R$ is improper, then
$-R$ is proper (since in $\mathbb{R}^3$ $\det(-R) = -\det(R)$), so $R$ is the composition
of a proper rotation and a symmetry with respect to the origin.

² Recall that the transpose of a linear operator $A : \mathbb{R}^d \to \mathbb{R}^d$ is the linear operator
$B : \mathbb{R}^d \to \mathbb{R}^d$ which satisfies $A(x) \cdot y = x \cdot B(y)$ for all $x, y \in \mathbb{R}^d$. We write $B = A^t$. The
inverse of $A$ (when it exists) is the linear operator $C : \mathbb{R}^d \to \mathbb{R}^d$ with $A \circ C = C \circ A = I$
(where $I$ is the identity), and we write $C = A^{-1}$.

Example 4. Given two orthonormal bases $\{e_1, \ldots, e_d\}$ and $\{e_1', \ldots, e_d'\}$
in $\mathbb{R}^d$, we can define a rotation $R$ by letting $R(e_i) = e_i'$ for $i = 1, \ldots, d$.
Conversely, if $R$ is a rotation and $\{e_1, \ldots, e_d\}$ is an orthonormal basis,
then $\{e_1', \ldots, e_d'\}$, where $e_j' = R(e_j)$, is another orthonormal basis.
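
The properties of rotations listed above are easy to test on a concrete matrix. The sketch below builds a proper rotation of $\mathbb{R}^3$ about a given axis using Rodrigues' formula, which is a standard construction not taken from the text, and then checks that it fixes the axis, preserves inner products, satisfies $R^t = R^{-1}$, and has determinant 1. The function name and the sample vectors are our own.

```python
import numpy as np

def rotation_about_axis(gamma, angle):
    """Proper rotation of R^3 fixing the direction gamma (Rodrigues' formula)."""
    g = np.asarray(gamma, dtype=float)
    g = g / np.linalg.norm(g)
    K = np.array([[0.0, -g[2], g[1]],
                  [g[2], 0.0, -g[0]],
                  [-g[1], g[0], 0.0]])        # K @ v equals the cross product g x v
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

R = rotation_about_axis([1.0, 2.0, 2.0], 0.8)
x, y = np.array([0.3, -1.2, 2.0]), np.array([1.5, 0.4, -0.7])

print(np.allclose(R.T @ R, np.eye(3)))          # transpose equals inverse
print(np.linalg.det(R), np.linalg.det(-R))      # proper rotation; -R is improper in R^3
print((R @ x) @ (R @ y), x @ y)                 # inner product is preserved
```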

1.2 Integration on $\mathbb{R}^d$

Since we shall be dealing with functions on $\mathbb{R}^d$, we will need to discuss
some aspects of integration of such functions. A more detailed review of
integration on $\mathbb{R}^d$ is given in the appendix.

A continuous complex-valued function $f$ on $\mathbb{R}^d$ is said to be rapidly
decreasing if for every multi-index $\alpha$ the function $|x^{\alpha} f(x)|$ is bounded.
Equivalently, a continuous function is of rapid decrease if

$$\sup_{x \in \mathbb{R}^d} |x|^k |f(x)| < \infty \quad \text{for every } k = 0, 1, 2, \ldots.$$

Given a function of rapid decrease, we define

$$\int_{\mathbb{R}^d} f(x)\,dx = \lim_{N \to \infty} \int_{Q_N} f(x)\,dx,$$

where $Q_N$ denotes the closed cube centered at the origin, with sides of
length $N$ parallel to the coordinate axis, that is,

$$Q_N = \{x \in \mathbb{R}^d : |x_i| \leq N/2 \text{ for } i = 1, \ldots, d\}.$$

The integral over $Q_N$ is a multiple integral in the usual sense of Riemann
integration. That the limit exists follows from the fact that the integrals
$I_N = \int_{Q_N} f(x)\,dx$ form a Cauchy sequence as $N$ tends to infinity.

Two observations are in order. First, we may replace the square $Q_N$
by the ball $B_N = \{x \in \mathbb{R}^d : |x| \leq N\}$ without changing the definition.
Second, we do not need the full force of rapid decrease to show that the
limit exists. In fact it suffices to assume that $f$ is continuous and

$$\sup_{x \in \mathbb{R}^d} |x|^{d+\epsilon} |f(x)| < \infty \quad \text{for some } \epsilon > 0.$$

For example, functions of moderate decrease on $\mathbb{R}$ correspond to $\epsilon = 1$.
In keeping with this we define functions of moderate decrease on $\mathbb{R}^d$
as those that are continuous and satisfy the above inequality with $\epsilon = 1$.

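The interaction of integration with the three important groups of
symmetries is as follows: if $f$ is of moderate decrease, then

(i) $\int_{\mathbb{R}^d} f(x + h)\,dx = \int_{\mathbb{R}^d} f(x)\,dx$ for all $h \in \mathbb{R}^d$,

(ii) $\delta^d \int_{\mathbb{R}^d} f(\delta x)\,dx = \int_{\mathbb{R}^d} f(x)\,dx$ for all $\delta > 0$,

(iii) $\int_{\mathbb{R}^d} f(R(x))\,dx = \int_{\mathbb{R}^d} f(x)\,dx$ for every rotation $R$.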





Polar coordinates 

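It will be convenient to introduce polar coordinates in $\mathbb{R}^d$ and find the
corresponding integration formula. We begin with two examples which
correspond to the case $d = 2$ and $d = 3$. (A more elaborate discussion
applying to all $d$ is contained in the appendix.)

Example 1. In $\mathbb{R}^2$, polar coordinates are given by $(r, \theta)$ with $r \geq 0$ and
$0 \leq \theta < 2\pi$. The Jacobian of the change of variables is equal to $r$, so that

$$\int_{\mathbb{R}^2} f(x)\,dx = \int_0^{2\pi} \int_0^{\infty} f(r\cos\theta, r\sin\theta)\,r\,dr\,d\theta.$$

Now we may write a point on the unit circle $S^1$ as $\gamma = (\cos\theta, \sin\theta)$, and
given a function $g$ on the circle, we define its integral over $S^1$ by

$$\int_{S^1} g(\gamma)\,d\sigma(\gamma) = \int_0^{2\pi} g(\cos\theta, \sin\theta)\,d\theta.$$

With this notation we then have

$$\int_{\mathbb{R}^2} f(x)\,dx = \int_{S^1} \int_0^{\infty} f(r\gamma)\,r\,dr\,d\sigma(\gamma).$$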
Example 2. In $\mathbb{R}^3$ one uses spherical coordinates given by

$$x_1 = r\sin\theta\cos\varphi, \qquad x_2 = r\sin\theta\sin\varphi, \qquad x_3 = r\cos\theta,$$

where $0 < r$, $0 \leq \theta \leq \pi$ and $0 \leq \varphi < 2\pi$. The Jacobian of the change of
variables is $r^2 \sin\theta$ so that

$$\int_{\mathbb{R}^3} f(x)\,dx = \int_0^{2\pi} \int_0^{\pi} \int_0^{\infty} f(r\sin\theta\cos\varphi, r\sin\theta\sin\varphi, r\cos\theta)\,r^2\,dr\,\sin\theta\,d\theta\,d\varphi.$$

If $g$ is a function on the unit sphere $S^2 = \{x \in \mathbb{R}^3 : |x| = 1\}$, and $\gamma =
(\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$, we define the surface element $d\sigma(\gamma)$ by

$$\int_{S^2} g(\gamma)\,d\sigma(\gamma) = \int_0^{2\pi} \int_0^{\pi} g(\gamma)\,\sin\theta\,d\theta\,d\varphi.$$

As a result,

$$\int_{\mathbb{R}^3} f(x)\,dx = \int_{S^2} \int_0^{\infty} f(r\gamma)\,r^2\,dr\,d\sigma(\gamma).$$

In general, it is possible to write any point in $\mathbb{R}^d - \{0\}$ uniquely as

$$x = r\gamma$$

where $\gamma$ lies on the unit sphere $S^{d-1} \subset \mathbb{R}^d$ and $r > 0$. Indeed, take $r = |x|$
and $\gamma = x/|x|$. Thus one may proceed as in the cases $d = 2$ or $d = 3$ to
define spherical coordinates. The formula we shall use is

$$\int_{\mathbb{R}^d} f(x)\,dx = \int_{S^{d-1}} \int_0^{\infty} f(r\gamma)\,r^{d-1}\,dr\,d\sigma(\gamma),$$

whenever $f$ is of moderate decrease. Here $d\sigma(\gamma)$ denotes the surface
element on the sphere $S^{d-1}$ obtained from the spherical coordinates.
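
As an illustration of the polar coordinate formula, consider a radial function $f(x) = f_0(|x|)$. The inner integral then does not depend on $\gamma$, and the formula reduces to the total surface measure of the sphere times a one-dimensional integral. The sketch below applies this to the Gaussian $e^{-\pi|x|^2}$ in dimensions 2 and 3, where the value of the integral should be 1; the function name and the truncation at `upper` are our own choices.

```python
import math

def radial_integral(f0, d, sphere_area, upper=10.0, steps=100000):
    """Integral over R^d of the radial function f0(|x|), computed as
    sphere_area * (integral over (0, upper) of f0(r) * r^(d-1) dr) by the midpoint rule."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += f0(r) * r ** (d - 1)
    return sphere_area * h * total

def gaussian(r):
    return math.exp(-math.pi * r * r)

area_circle = 2.0 * math.pi     # integral of d(theta) over [0, 2*pi)
area_sphere = 4.0 * math.pi     # integral of sin(theta) d(theta) d(phi)

print(radial_integral(gaussian, 2, area_circle))
print(radial_integral(gaussian, 3, area_sphere))
```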


2 Elementary theory of the Fourier transform 

The Schwartz space $\mathcal{S}(\mathbb{R}^d)$ (sometimes abbreviated as $\mathcal{S}$) consists of
all indefinitely differentiable functions $f$ on $\mathbb{R}^d$ such that

$$\sup_{x \in \mathbb{R}^d} \left| x^{\alpha} \left(\frac{\partial}{\partial x}\right)^{\beta} f(x) \right| < \infty,$$

for every multi-index $\alpha$ and $\beta$. In other words, $f$ and all its derivatives
are required to be rapidly decreasing.

Example 1. An example of a function in $\mathcal{S}(\mathbb{R}^d)$ is the $d$-dimensional
Gaussian given by $e^{-\pi|x|^2}$. The theory in Chapter 5 already made clear
the central role played by this function in the case $d = 1$.

The Fourier transform of a Schwartz function $f$ is defined by

$$\hat{f}(\xi) = \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot \xi}\,dx, \quad \text{for } \xi \in \mathbb{R}^d.$$

Note the resemblance with the formula in one-dimension, except that we
are now integrating on $\mathbb{R}^d$, and the product of $x$ and $\xi$ is replaced by the
inner product of the two vectors.

We now list some simple properties of the Fourier transform. In the
next proposition the arrow indicates that we have taken the Fourier
transform, so $F(x) \longrightarrow G(\xi)$ means that $G(\xi) = \hat{F}(\xi)$.

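Proposition 2.1 Let $f \in \mathcal{S}(\mathbb{R}^d)$.

(i) $f(x + h) \longrightarrow \hat{f}(\xi) e^{2\pi i \xi \cdot h}$ whenever $h \in \mathbb{R}^d$.

(ii) $f(x) e^{-2\pi i x \cdot h} \longrightarrow \hat{f}(\xi + h)$ whenever $h \in \mathbb{R}^d$.

(iii) $f(\delta x) \longrightarrow \delta^{-d} \hat{f}(\delta^{-1} \xi)$ whenever $\delta > 0$.

(iv) $\left(\frac{\partial}{\partial x}\right)^{\alpha} f(x) \longrightarrow (2\pi i \xi)^{\alpha} \hat{f}(\xi)$.

(v) $(-2\pi i x)^{\alpha} f(x) \longrightarrow \left(\frac{\partial}{\partial \xi}\right)^{\alpha} \hat{f}(\xi)$.

(vi) $f(Rx) \longrightarrow \hat{f}(R\xi)$ whenever $R$ is a rotation.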

The first five properties are proved in the same way as in the
one-dimensional case. To verify the last property, simply change variables
$y = Rx$ in the integral. Then, recall that $|\det(R)| = 1$, and
$R^{-1} y \cdot \xi = y \cdot R\xi$, because $R$ is a rotation.
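
Property (vi) can also be observed numerically in dimension $d = 2$. In the sketch below we take the non-radial Schwartz function $f(x) = x_1 e^{-\pi|x|^2}$, whose transform $\hat{f}(\xi) = -i\xi_1 e^{-\pi|\xi|^2}$ follows from property (v) applied to the Gaussian, and compare a midpoint-rule approximation of the transform of $f(Rx)$ with $\hat{f}(R\xi)$. The choice of $f$, the angle, the grid size, and the function names are our own.

```python
import numpy as np

def fourier_transform_2d(func, xi, half_width=5.0, n=400):
    """Midpoint-rule approximation of the integral of func(x) * exp(-2*pi*i*x.xi)
    over the square [-half_width, half_width]^2."""
    h = 2.0 * half_width / n
    grid = -half_width + (np.arange(n) + 0.5) * h
    x1, x2 = np.meshgrid(grid, grid, indexing="ij")
    phase = np.exp(-2j * np.pi * (x1 * xi[0] + x2 * xi[1]))
    return np.sum(func(x1, x2) * phase) * h * h

def f(x1, x2):
    return x1 * np.exp(-np.pi * (x1 ** 2 + x2 ** 2))      # a non-radial Schwartz function

def f_hat(xi1, xi2):
    return -1j * xi1 * np.exp(-np.pi * (xi1 ** 2 + xi2 ** 2))   # from property (v)

theta = 0.6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])

def f_rotated(x1, x2):
    # Evaluates f(Rx) componentwise on the grid.
    return f(R[0, 0] * x1 + R[0, 1] * x2, R[1, 0] * x1 + R[1, 1] * x2)

xi = np.array([0.4, -0.3])
R_xi = R @ xi
print(fourier_transform_2d(f_rotated, xi), f_hat(R_xi[0], R_xi[1]))
```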

Properties (iv) and (v) in the proposition show that, up to factors of
$2\pi i$, the Fourier transform interchanges differentiation and multiplication
by monomials. This motivates the definition of the Schwartz space and
leads to the next corollary.

Corollary 2.2 The Fourier transform maps $\mathcal{S}(\mathbb{R}^d)$ to itself.


At this point we digress to observe a simple fact concerning the interplay
between the Fourier transform and rotations. We say that a
function $f$ is radial if it depends only on $|x|$; in other words, $f$ is radial
if there is a function $f_0(u)$, defined for $u \ge 0$, such that $f(x) = f_0(|x|)$.
We note that $f$ is radial if and only if $f(Rx) = f(x)$ for every rotation
$R$. In one direction, this is obvious since $|Rx| = |x|$. Conversely, suppose
that $f(Rx) = f(x)$, for all rotations $R$. Now define $f_0$ by

\[ f_0(u) = \begin{cases} f(0) & \text{if } u = 0, \\ f(x) & \text{if } |x| = u. \end{cases} \]

Note that $f_0$ is well defined, since if $x$ and $x'$ are points with $|x| = |x'|$
there is always a rotation $R$ so that $x' = Rx$.

Corollary 2.3 The Fourier transform of a radial function is radial.

This follows at once from property (vi) in the last proposition. Indeed,
the condition $f(Rx) = f(x)$ for all $R$ implies that $\hat{f}(R\xi) = \hat{f}(\xi)$ for all
$R$, thus $\hat{f}$ is radial whenever $f$ is.

An example of a radial function in $\mathbb{R}^d$ is the Gaussian $e^{-\pi |x|^2}$. Also,
we observe that when $d = 1$, the radial functions are precisely the even
functions, that is, those for which $f(x) = f(-x)$.

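After these preliminaries, we retrace the steps taken in the previous
chapter to obtain the Fourier inversion formula and Plancherel theorem
for $\mathbb{R}^d$.

Theorem 2.4 Suppose $f \in \mathcal{S}(\mathbb{R}^d)$. Then

\[ f(x) = \int_{\mathbb{R}^d} \hat{f}(\xi) e^{2\pi i x \cdot \xi}\, d\xi. \]

Moreover

\[ \int_{\mathbb{R}^d} |\hat{f}(\xi)|^2\, d\xi = \int_{\mathbb{R}^d} |f(x)|^2\, dx. \]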


The proof proceeds in the following stages. 

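Step 1. The Fourier transform of $e^{-\pi |x|^2}$ is $e^{-\pi |\xi|^2}$. To prove this,
notice that the properties of the exponential functions imply that

\[ e^{-\pi |x|^2} = e^{-\pi x_1^2} \cdots e^{-\pi x_d^2} \quad \text{and} \quad e^{-2\pi i x \cdot \xi} = e^{-2\pi i x_1 \xi_1} \cdots e^{-2\pi i x_d \xi_d}, \]

so that the integrand in the Fourier transform is a product of $d$ functions,
each depending on the variable $x_j$ ($1 \le j \le d$) only. Thus the assertion
follows by writing the integral over $\mathbb{R}^d$ as a series of repeated integrals,
each taken over $\mathbb{R}$. For example, when $d = 2$,

\[ \begin{aligned}
\int_{\mathbb{R}^2} e^{-\pi |x|^2} e^{-2\pi i x \cdot \xi}\, dx
&= \int_{\mathbb{R}} e^{-\pi x_2^2} e^{-2\pi i x_2 \xi_2} \left( \int_{\mathbb{R}} e^{-\pi x_1^2} e^{-2\pi i x_1 \xi_1}\, dx_1 \right) dx_2 \\
&= \int_{\mathbb{R}} e^{-\pi x_2^2} e^{-2\pi i x_2 \xi_2} e^{-\pi \xi_1^2}\, dx_2 \\
&= e^{-\pi \xi_1^2} e^{-\pi \xi_2^2} \\
&= e^{-\pi |\xi|^2}.
\end{aligned} \]

As a consequence of Proposition 2.1, applied with $\delta^{1/2}$ instead of $\delta$, we
find that $\widehat{(e^{-\pi \delta |x|^2})} = \delta^{-d/2} e^{-\pi |\xi|^2 / \delta}$.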
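
The computation in Step 1 can be checked numerically. The sketch below is an illustration, not part of the proof: it approximates the one-dimensional transform of $e^{-\pi \delta x^2}$ with the trapezoid rule, multiplies $d$ such factors as in the repeated-integral argument, and compares the result with $\delta^{-d/2} e^{-\pi |\xi|^2 / \delta}$. The function names, the truncation interval, and the grid size are assumptions made here for illustration.

```python
import cmath
import math


def gaussian_ft_1d(xi, delta=1.0, half_width=8.0, n=4000):
    """Trapezoid-rule approximation of the 1-D Fourier transform of
    exp(-pi * delta * x^2) at frequency xi, truncated to [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.0 + 0.0j
    for k in range(n + 1):
        x = -half_width + k * h
        w = 0.5 if k in (0, n) else 1.0
        total += w * math.exp(-math.pi * delta * x * x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h


def gaussian_ft(xi_vec, delta=1.0):
    """d-dimensional transform as a product of one-dimensional transforms."""
    result = 1.0 + 0.0j
    for xi in xi_vec:
        result *= gaussian_ft_1d(xi, delta)
    return result


def predicted(xi_vec, delta=1.0):
    """Closed form delta^(-d/2) * exp(-pi |xi|^2 / delta) from Step 1."""
    d = len(xi_vec)
    norm_sq = sum(c * c for c in xi_vec)
    return delta ** (-d / 2) * math.exp(-math.pi * norm_sq / delta)


if __name__ == "__main__":
    xi = (0.3, -0.7)
    print(abs(gaussian_ft(xi, 0.5) - predicted(xi, 0.5)))  # should be tiny
```

The product structure in `gaussian_ft` mirrors the argument in the text: no genuinely multi-dimensional quadrature is needed for the Gaussian.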

Step 2. The family $K_\delta(x) = \delta^{-d/2} e^{-\pi |x|^2 / \delta}$ is a family of good kernels
in $\mathbb{R}^d$. By this we mean that

(i) $\int_{\mathbb{R}^d} K_\delta(x)\, dx = 1$,

(ii) $\int_{\mathbb{R}^d} |K_\delta(x)|\, dx \le M$ (in fact $K_\delta(x) \ge 0$),

(iii) For every $\eta > 0$, $\int_{|x| \ge \eta} |K_\delta(x)|\, dx \to 0$ as $\delta \to 0$.

The proofs of these assertions are almost identical to the case $d = 1$. As
a result

\[ \int_{\mathbb{R}^d} K_\delta(x) F(x)\, dx \to F(0) \quad \text{as } \delta \to 0 \]

when $F$ is a Schwartz function, or more generally when $F$ is bounded
and continuous at the origin.
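
As a small illustration of the good-kernel properties in the case $d = 1$, the following sketch checks numerically that the kernel has total mass 1 and that its mass outside $[-\eta, \eta]$ shrinks as $\delta \to 0$. The helper names and grid parameters are assumptions made here; the tail formula uses the standard identity that the tail of a Gaussian is a complementary error function.

```python
import math


def kernel(x, delta):
    """One-dimensional kernel K_delta(x) = delta^(-1/2) * exp(-pi x^2 / delta)."""
    return math.exp(-math.pi * x * x / delta) / math.sqrt(delta)


def total_mass(delta, half_width=10.0, n=20000):
    """Midpoint-rule approximation of the integral of K_delta over the real line."""
    h = 2.0 * half_width / n
    return h * sum(kernel(-half_width + (k + 0.5) * h, delta) for k in range(n))


def tail_mass(eta, delta):
    """Mass of K_delta outside [-eta, eta]; substituting x = sqrt(delta/pi) * s
    turns the tail integral into erfc(eta * sqrt(pi / delta))."""
    return math.erfc(eta * math.sqrt(math.pi / delta))


if __name__ == "__main__":
    for delta in (1.0, 0.1, 0.01):
        print(delta, total_mass(delta), tail_mass(0.5, delta))
```

The tail masses decrease rapidly with $\delta$, which is property (iii) in concrete form.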

Step 3. The multiplication formula

\[ \int_{\mathbb{R}^d} f(x) \hat{g}(x)\, dx = \int_{\mathbb{R}^d} \hat{f}(y) g(y)\, dy \]

holds whenever $f$ and $g$ are in $\mathcal{S}$. The proof requires the evaluation of the
integral of $f(x) g(y) e^{-2\pi i x \cdot y}$ over $(x, y) \in \mathbb{R}^{2d} = \mathbb{R}^d \times \mathbb{R}^d$ as a repeated
integral, with each separate integration taken over $\mathbb{R}^d$. The justification
is similar to that in the proof of Proposition 1.8 in the previous chapter.
(See the appendix.)

The Fourier inversion is then a simple consequence of the multiplication
formula and the family of good kernels $K_\delta$, as in Chapter 5. It also
follows that the Fourier transform $\mathcal{F}$ is a bijective map of $\mathcal{S}(\mathbb{R}^d)$ to itself,
whose inverse is

\[ \mathcal{F}^{*}(g)(x) = \int_{\mathbb{R}^d} g(\xi) e^{2\pi i x \cdot \xi}\, d\xi. \]

Step 4. Next we turn to the convolution, defined by

\[ (f * g)(x) = \int_{\mathbb{R}^d} f(y) g(x - y)\, dy, \quad f, g \in \mathcal{S}. \]

We have that $f * g \in \mathcal{S}(\mathbb{R}^d)$, $f * g = g * f$, and $\widehat{(f * g)}(\xi) = \hat{f}(\xi) \hat{g}(\xi)$.
The argument is similar to that in one dimension. The calculation of the
Fourier transform of $f * g$ involves an integration of $f(y) g(x - y) e^{-2\pi i x \cdot \xi}$
(over $\mathbb{R}^{2d} = \mathbb{R}^d \times \mathbb{R}^d$) expressed as a repeated integral.

Then, following the same argument as in the previous chapter, we obtain
the $d$-dimensional Plancherel formula, thereby concluding the proof of
Theorem 2.4.

3 The wave equation in $\mathbb{R}^d \times \mathbb{R}$

Our next goal is to apply what we have learned about the Fourier transform
to the study of the wave equation. Here, we once again simplify
matters by restricting ourselves to functions in the Schwartz class $\mathcal{S}$. We
note that in any further analysis of the wave equation it is important to 
allow functions that have much more general behavior, and in particular 
that may be discontinuous. However, what we lose in generality by only 
considering Schwartz functions, we gain in transparency. Our study in 
this restricted context will allow us to explain certain basic ideas in their 
simplest form. 

3.1 Solution in terms of Fourier transforms 

The motion of a vibrating string satisfies the equation

\[ \frac{\partial^2 u}{\partial x^2} = \frac{1}{c^2} \frac{\partial^2 u}{\partial t^2}, \]

which we referred to as the one-dimensional wave equation.

A natural generalization of this equation to $d$ space variables is

\[ \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_d^2} = \frac{1}{c^2} \frac{\partial^2 u}{\partial t^2}. \tag{1} \]

In fact, it is known that in the case $d = 3$, this equation determines the
behavior of electromagnetic waves in vacuum (with $c$ = speed of light).
Also, this equation describes the propagation of sound waves. Thus (1)
is called the $d$-dimensional wave equation.

Our first observation is that we may assume $c = 1$, since we can rescale
the variable $t$ if necessary. Also, if we define the Laplacian in $d$ dimensions
by

\[ \Delta = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_d^2}, \]

then the wave equation can be rewritten as

\[ \Delta u = \frac{\partial^2 u}{\partial t^2}. \tag{2} \]

The goal of this section is to find a solution to this equation, subject
to the initial conditions

\[ u(x, 0) = f(x) \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = g(x), \]

where $f, g \in \mathcal{S}(\mathbb{R}^d)$. This is called the Cauchy problem for the wave
equation.

Before solving this problem, we note that while we think of the variable
$t$ as time, we do not restrict ourselves to $t \ge 0$. As we will see, the solution
we obtain makes sense for all $t \in \mathbb{R}$. This is a manifestation of the fact
that the wave equation can be reversed in time (unlike the heat equation).

A formula for the solution of our problem is given in the next theorem. 
The heuristic argument which leads to this formula is important since, as 
we have already seen, it applies to some other boundary value problems 
as well. 

Suppose $u$ solves the Cauchy problem for the wave equation. The
technique employed consists of taking the Fourier transform of the equation
and of the initial conditions, with respect to the space variables
$x_1, \ldots, x_d$. This reduces the problem to an ordinary differential equation
in the time variable. Indeed, recalling that differentiation with respect to
$x_j$ becomes multiplication by $2\pi i \xi_j$, and the differentiation with respect
to $t$ commutes with the Fourier transform in the space variables, we find
that (2) becomes

\[ -4\pi^2 |\xi|^2 \hat{u}(\xi, t) = \frac{\partial^2 \hat{u}}{\partial t^2}(\xi, t). \]

For each fixed $\xi \in \mathbb{R}^d$, this is an ordinary differential equation in $t$ whose
solution is given by

\[ \hat{u}(\xi, t) = A(\xi) \cos(2\pi |\xi| t) + B(\xi) \sin(2\pi |\xi| t), \]

where for each $\xi$, $A(\xi)$ and $B(\xi)$ are unknown constants to be determined
by the initial conditions. In fact, taking the Fourier transform (in $x$) of
the initial conditions yields

\[ \hat{u}(\xi, 0) = \hat{f}(\xi) \quad \text{and} \quad \frac{\partial \hat{u}}{\partial t}(\xi, 0) = \hat{g}(\xi). \]

We may now solve for $A(\xi)$ and $B(\xi)$ to obtain

\[ A(\xi) = \hat{f}(\xi) \quad \text{and} \quad 2\pi |\xi| B(\xi) = \hat{g}(\xi). \]

Therefore, we find that

\[ \hat{u}(\xi, t) = \hat{f}(\xi) \cos(2\pi |\xi| t) + \hat{g}(\xi) \frac{\sin(2\pi |\xi| t)}{2\pi |\xi|}, \]

and the solution $u$ is given by taking the inverse Fourier transform in
the $\xi$ variables. This formal derivation then leads to a precise existence
theorem for our problem.


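Theorem 3.1 A solution of the Cauchy problem for the wave equation
is

\[ u(x, t) = \int_{\mathbb{R}^d} \left[ \hat{f}(\xi) \cos(2\pi |\xi| t) + \hat{g}(\xi) \frac{\sin(2\pi |\xi| t)}{2\pi |\xi|} \right] e^{2\pi i x \cdot \xi}\, d\xi. \tag{3} \]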


Proof. We first verify that $u$ solves the wave equation. This is
straightforward once we note that we can differentiate in $x$ and $t$ under
the integral sign (because $f$ and $g$ are both Schwartz functions) and
therefore $u$ is at least $C^2$. On the one hand we differentiate the exponential
with respect to the $x$ variables to get

\[ \Delta u(x, t) = \int_{\mathbb{R}^d} \left[ \hat{f}(\xi) \cos(2\pi |\xi| t) + \hat{g}(\xi) \frac{\sin(2\pi |\xi| t)}{2\pi |\xi|} \right] (-4\pi^2 |\xi|^2) e^{2\pi i x \cdot \xi}\, d\xi, \]

while on the other hand we differentiate the terms in brackets with respect
to $t$ twice to get

\[ \frac{\partial^2 u}{\partial t^2}(x, t) = \int_{\mathbb{R}^d} \left[ -4\pi^2 |\xi|^2 \hat{f}(\xi) \cos(2\pi |\xi| t) - 4\pi^2 |\xi|^2 \hat{g}(\xi) \frac{\sin(2\pi |\xi| t)}{2\pi |\xi|} \right] e^{2\pi i x \cdot \xi}\, d\xi. \]

This shows that $u$ solves equation (2). Setting $t = 0$ we get

\[ u(x, 0) = \int_{\mathbb{R}^d} \hat{f}(\xi) e^{2\pi i x \cdot \xi}\, d\xi = f(x) \]

by the Fourier inversion theorem. Finally, differentiating once with respect
to $t$, setting $t = 0$, and using the Fourier inversion shows that

\[ \frac{\partial u}{\partial t}(x, 0) = g(x). \]

Thus $u$ also verifies the initial conditions, and the proof of the theorem
is complete.

As the reader will note, both $\hat{f}(\xi) \cos(2\pi |\xi| t)$ and $\hat{g}(\xi) \frac{\sin(2\pi |\xi| t)}{2\pi |\xi|}$ are
functions in $\mathcal{S}$, assuming as we do that $f$ and $g$ are in $\mathcal{S}$. This is because
both $\cos u$ and $(\sin u)/u$ are even functions that are indefinitely
differentiable.

Having proved the existence of a solution to the Cauchy problem for the 
wave equation, we raise the question of uniqueness. Are there solutions 
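to the problem

\[ \Delta u = \frac{\partial^2 u}{\partial t^2} \quad \text{subject to} \quad u(x, 0) = f(x) \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = g(x), \]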


other than the one given by the formula in the theorem? In fact the 
answer is, as expected, no. The proof of this fact, which will not be 
given here (but see Problem 3), can be based on a conservation of energy 
argument. This is a local counterpart of a global conservation of energy 
statement which we will now present. 

We observed in Exercise 10, Chapter 3, that in the one-dimensional 
case, the total energy of the vibrating string is conserved in time. The 
analogue of this fact holds in higher dimensions as well. Define the 
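energy of a solution by

\[ E(t) = \int_{\mathbb{R}^d} \left| \frac{\partial u}{\partial t} \right|^2 + \left| \frac{\partial u}{\partial x_1} \right|^2 + \cdots + \left| \frac{\partial u}{\partial x_d} \right|^2 dx. \]

Theorem 3.2 If $u$ is the solution of the wave equation given by formula (3),
then $E(t)$ is conserved, that is,

\[ E(t) = E(0), \quad \text{for all } t \in \mathbb{R}. \]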

The proof requires the following lemma. 


Lemma 3.3 Suppose $a$ and $b$ are complex numbers and $\alpha$ is real. Then

\[ |a \cos\alpha + b \sin\alpha|^2 + |-a \sin\alpha + b \cos\alpha|^2 = |a|^2 + |b|^2. \]

This follows directly because $e_1 = (\cos\alpha, \sin\alpha)$ and $e_2 = (-\sin\alpha, \cos\alpha)$
are a pair of orthonormal vectors, hence with $Z = (a, b) \in \mathbb{C}^2$, we have

\[ |Z|^2 = |Z \cdot e_1|^2 + |Z \cdot e_2|^2, \]

where $\cdot$ represents the inner product in $\mathbb{C}^2$.
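
The identity of Lemma 3.3 is easy to test on sample values. The snippet below is only a numerical sanity check with arbitrarily chosen complex numbers; the function names are made up for this illustration.

```python
import math


def lemma_lhs(a, b, alpha):
    """|a cos(alpha) + b sin(alpha)|^2 + |-a sin(alpha) + b cos(alpha)|^2."""
    c, s = math.cos(alpha), math.sin(alpha)
    return abs(a * c + b * s) ** 2 + abs(-a * s + b * c) ** 2


def lemma_rhs(a, b):
    """|a|^2 + |b|^2."""
    return abs(a) ** 2 + abs(b) ** 2


if __name__ == "__main__":
    a, b = 1 + 2j, -0.5 + 3j
    for alpha in (0.0, 0.7, 2.5):
        print(alpha, lemma_lhs(a, b, alpha), lemma_rhs(a, b))
```

The left-hand side does not change with `alpha`, which is exactly what makes the energy integrand independent of $t$.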

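Now by Plancherel's theorem,

\[ \int_{\mathbb{R}^d} \left| \frac{\partial u}{\partial t} \right|^2 dx = \int_{\mathbb{R}^d} \left| -2\pi |\xi| \hat{f}(\xi) \sin(2\pi |\xi| t) + \hat{g}(\xi) \cos(2\pi |\xi| t) \right|^2 d\xi. \]

Similarly,

\[ \int_{\mathbb{R}^d} \sum_{j=1}^{d} \left| \frac{\partial u}{\partial x_j} \right|^2 dx = \int_{\mathbb{R}^d} \left| 2\pi |\xi| \hat{f}(\xi) \cos(2\pi |\xi| t) + \hat{g}(\xi) \sin(2\pi |\xi| t) \right|^2 d\xi. \]

We now apply the lemma with

\[ a = 2\pi |\xi| \hat{f}(\xi), \quad b = \hat{g}(\xi) \quad \text{and} \quad \alpha = 2\pi |\xi| t. \]

The result is that

\[ E(t) = \int_{\mathbb{R}^d} \left| \frac{\partial u}{\partial t} \right|^2 + \left| \frac{\partial u}{\partial x_1} \right|^2 + \cdots + \left| \frac{\partial u}{\partial x_d} \right|^2 dx = \int_{\mathbb{R}^d} \left( 4\pi^2 |\xi|^2 |\hat{f}(\xi)|^2 + |\hat{g}(\xi)|^2 \right) d\xi, \]

which is clearly independent of $t$. Thus Theorem 3.2 is proved.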

The drawback with formula (3), which does give the solution of the
wave equation, is that it is quite indirect, involving the calculation of the
Fourier transforms of $f$ and $g$, and then a further inverse Fourier transform.
However, for every dimension $d$ there is a more explicit formula.
This formula is very simple when $d = 1$ and a little less so when $d = 3$.
More generally, the formula is "elementary" whenever $d$ is odd, and more
complicated when $d$ is even (see Problems 4 and 5).

In what follows we consider the cases $d = 1$, $d = 3$, and $d = 2$, which
together give a picture of the general situation. Recall that in Chapter 1,
when discussing the wave equation over the interval $[0, L]$, we found that
the solution is given by d'Alembert's formula

\[ u(x, t) = \frac{f(x + t) + f(x - t)}{2} + \frac{1}{2} \int_{x - t}^{x + t} g(y)\, dy, \tag{4} \]

with the interpretation that both $f$ and $g$ are extended outside $[0, L]$ by
making them odd in $[-L, L]$, and periodic on the real line, with period
$2L$. The same formula (4) holds for the solution of the wave equation
when $d = 1$ and when the initial data are functions in $\mathcal{S}(\mathbb{R})$. In fact, this
follows directly from (3) if we note that

\[ \cos(2\pi |\xi| t) = \frac{1}{2} \left( e^{2\pi i |\xi| t} + e^{-2\pi i |\xi| t} \right) \]

and

\[ \frac{\sin(2\pi |\xi| t)}{2\pi |\xi|} = \frac{1}{4\pi i |\xi|} \left( e^{2\pi i |\xi| t} - e^{-2\pi i |\xi| t} \right). \]

Finally, we note that the two terms that appear in d'Alembert's formula (4)
consist of appropriate averages. Indeed, the first term is precisely
the average of $f$ over the two points that are the boundary of the
interval $[x - t, x + t]$; the second term is, up to a factor of $t$, the mean
value of $g$ over this interval, that is, $(1/2t) \int_{x - t}^{x + t} g(y)\, dy$. This suggests a
generalization to higher dimensions, where we might expect to write the
solution of our problem as averages of the initial data. This is in fact the
case, and we now treat in detail the particular situation d = 3.
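
The agreement between the Fourier formula (3) and d'Alembert's formula (4) when $d = 1$ can be observed numerically. The sketch below uses illustrative initial data chosen here, $f(x) = e^{-\pi x^2}$ and $g(x) = x e^{-\pi x^2}$, whose transforms are known in closed form; the quadrature parameters and function names are likewise assumptions of this example.

```python
import cmath
import math


def f(x):
    return math.exp(-math.pi * x * x)


def g(x):
    return x * math.exp(-math.pi * x * x)


def f_hat(xi):
    # Transform of the Gaussian in dimension one.
    return math.exp(-math.pi * xi * xi)


def g_hat(xi):
    # Multiplication by x corresponds to -(1/(2 pi i)) d/dxi on the transform
    # side, which gives -i * xi * exp(-pi xi^2).
    return -1j * xi * math.exp(-math.pi * xi * xi)


def fourier_solution(x, t, half_width=6.0, n=2400):
    """Formula (3) for d = 1, evaluated with the trapezoid rule in xi."""
    h = 2.0 * half_width / n
    total = 0.0 + 0.0j
    for k in range(n + 1):
        xi = -half_width + k * h
        w = 0.5 if k in (0, n) else 1.0
        a = 2.0 * math.pi * abs(xi)
        # sin(2 pi |xi| t) / (2 pi |xi|), with its limiting value t at xi = 0
        ratio = t if a < 1e-12 else math.sin(a * t) / a
        bracket = f_hat(xi) * math.cos(a * t) + g_hat(xi) * ratio
        total += w * bracket * cmath.exp(2j * math.pi * x * xi)
    return (total * h).real


def dalembert(x, t):
    """Formula (4); the integral of g uses the antiderivative -exp(-pi y^2) / (2 pi)."""
    def antiderivative(y):
        return -math.exp(-math.pi * y * y) / (2.0 * math.pi)
    return 0.5 * (f(x + t) + f(x - t)) + 0.5 * (antiderivative(x + t) - antiderivative(x - t))


if __name__ == "__main__":
    for x, t in ((0.3, 0.8), (-1.1, -0.4)):
        print(x, t, fourier_solution(x, t), dalembert(x, t))
```

Negative times are included on purpose, since the solution makes sense for all real $t$.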

3.2 The wave equation in $\mathbb{R}^3 \times \mathbb{R}$

If $S^2$ denotes the unit sphere in $\mathbb{R}^3$, we define the spherical mean of
the function $f$ over the sphere of radius $t$ centered at $x$ by

\[ M_t(f)(x) = \frac{1}{4\pi} \int_{S^2} f(x - t\gamma)\, d\sigma(\gamma), \tag{5} \]

where $d\sigma(\gamma)$ is the element of surface area for $S^2$. Since $4\pi$ is the area
of the unit sphere, we can interpret $M_t(f)$ as the average value of $f$ over
the sphere centered at $x$ of radius $t$.

Lemma 3.4 If $f \in \mathcal{S}(\mathbb{R}^3)$ and $t$ is fixed, then $M_t(f) \in \mathcal{S}(\mathbb{R}^3)$. Moreover,
$M_t(f)$ is indefinitely differentiable in $t$, and each $t$-derivative also belongs
to $\mathcal{S}(\mathbb{R}^3)$.

Proof. Let $F(x) = M_t(f)(x)$. To show that $F$ is rapidly decreasing,
start with the inequality $|f(x)| \le A_N / (1 + |x|^N)$ which holds for every
fixed $N \ge 0$. As a simple consequence, whenever $t$ is fixed, we have

\[ |f(x - \gamma t)| \le A'_N / (1 + |x|^N) \quad \text{for all } \gamma \in S^2. \]

To see this consider separately the cases when $|x| \le 2|t|$, and $|x| > 2|t|$.
Therefore, by integration

\[ |F(x)| \le A'_N / (1 + |x|^N), \]

and since this holds for every $N$, the function $F$ is rapidly decreasing.
One next observes that $F$ is indefinitely differentiable, and

\[ \left( \frac{\partial}{\partial x} \right)^{\alpha} F(x) = M_t(f^{(\alpha)})(x), \tag{6} \]

where $f^{(\alpha)}(x) = (\partial / \partial x)^{\alpha} f$. It suffices to prove this when
$(\partial / \partial x)^{\alpha} = \partial / \partial x_k$, and then proceed by induction to get the general case.
Furthermore, it is enough to take $k = 1$. Now

\[ \frac{F(x_1 + h, x_2, x_3) - F(x_1, x_2, x_3)}{h} = \frac{1}{4\pi} \int_{S^2} g_h(\gamma)\, d\sigma(\gamma), \]

where

\[ g_h(\gamma) = \frac{f(x + e_1 h - \gamma t) - f(x - \gamma t)}{h}, \]

and $e_1 = (1, 0, 0)$. Now, it suffices to observe that $g_h \to \frac{\partial}{\partial x_1} f(x - \gamma t)$
as $h \to 0$ uniformly in $\gamma$. As a result, we find that (6) holds, and by
the first argument, it follows that $\left( \frac{\partial}{\partial x} \right)^{\alpha} F(x)$ is also rapidly decreasing,
hence $F \in \mathcal{S}$. The same argument applies to each $t$-derivative of $M_t(f)$.

The basic fact about integration on spheres that we shall need is the 
following Fourier transform formula. 


Lemma 3.5

\[ \frac{1}{4\pi} \int_{S^2} e^{-2\pi i \xi \cdot \gamma}\, d\sigma(\gamma) = \frac{\sin(2\pi |\xi|)}{2\pi |\xi|}. \]


This formula, as we shall see in the following section, is connected to 
the fact that the Fourier transform of a radial function is radial. 

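Proof. Note that the integral on the left is radial in $\xi$. Indeed, if $R$ is
a rotation then

\[ \int_{S^2} e^{-2\pi i R(\xi) \cdot \gamma}\, d\sigma(\gamma) = \int_{S^2} e^{-2\pi i \xi \cdot R^{-1}(\gamma)}\, d\sigma(\gamma) = \int_{S^2} e^{-2\pi i \xi \cdot \gamma}\, d\sigma(\gamma) \]

because we may change variables $\gamma \to R^{-1}(\gamma)$. (For this, see formula (4)
in the appendix.) So if $|\xi| = \rho$, it suffices to prove the lemma with
$\xi = (0, 0, \rho)$. If $\rho = 0$, the lemma is obvious. If $\rho > 0$, we choose spherical
coordinates to find that the left-hand side is equal to

\[ \frac{1}{4\pi} \int_0^{2\pi} \int_0^{\pi} e^{-2\pi i \rho \cos\theta} \sin\theta\, d\theta\, d\varphi. \]

The change of variables $u = -\cos\theta$ gives

\[ \begin{aligned}
\frac{1}{4\pi} \int_0^{2\pi} \int_0^{\pi} e^{-2\pi i \rho \cos\theta} \sin\theta\, d\theta\, d\varphi
&= \frac{1}{2} \int_0^{\pi} e^{-2\pi i \rho \cos\theta} \sin\theta\, d\theta \\
&= \frac{1}{2} \int_{-1}^{1} e^{2\pi i \rho u}\, du \\
&= \frac{1}{4\pi i \rho} \left[ e^{2\pi i \rho u} \right]_{-1}^{1} \\
&= \frac{\sin(2\pi \rho)}{2\pi \rho},
\end{aligned} \]

and the formula is proved.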
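
The formula of Lemma 3.5 can be confirmed by direct quadrature over the sphere, including for vectors $\xi$ that do not point along an axis. The sketch below is illustrative only: it parametrizes $\gamma = (s\cos\varphi, s\sin\varphi, u)$ with $s = \sqrt{1 - u^2}$, so that $d\sigma = du\, d\varphi$, and uses Simpson's rule in $u$ with a uniform sum in $\varphi$. The function names and grid sizes are assumptions made here.

```python
import cmath
import math


def sphere_average_of_exponential(xi, n_u=400, n_phi=64):
    """(1/4pi) * integral over S^2 of exp(-2 pi i xi . gamma) d sigma(gamma)."""
    h = 2.0 / n_u  # n_u must be even for Simpson's rule
    total = 0.0 + 0.0j
    for j in range(n_u + 1):
        u = -1.0 + j * h
        s = math.sqrt(max(0.0, 1.0 - u * u))
        if j in (0, n_u):
            w = 1.0
        elif j % 2 == 1:
            w = 4.0
        else:
            w = 2.0
        inner = 0.0 + 0.0j
        for k in range(n_phi):
            phi = 2.0 * math.pi * k / n_phi
            dot = xi[0] * s * math.cos(phi) + xi[1] * s * math.sin(phi) + xi[2] * u
            inner += cmath.exp(-2j * math.pi * dot)
        total += w * inner * (2.0 * math.pi / n_phi)
    return total * (h / 3.0) / (4.0 * math.pi)


def sinc_formula(xi):
    """Right-hand side of Lemma 3.5: sin(2 pi |xi|) / (2 pi |xi|)."""
    rho = math.sqrt(sum(c * c for c in xi))
    if rho == 0.0:
        return 1.0
    return math.sin(2.0 * math.pi * rho) / (2.0 * math.pi * rho)


if __name__ == "__main__":
    for xi in ((0.3, -0.4, 0.5), (0.0, 0.0, 1.25)):
        print(xi, sphere_average_of_exponential(xi), sinc_formula(xi))
```

The computed average has negligible imaginary part and depends on $\xi$ only through $|\xi|$, in line with the radiality argument at the start of the proof.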

By the defining formula (5) we may interpret $M_t(f)$ as a convolution
of the function $f$ with the element $d\sigma$, and since the Fourier transform
interchanges convolutions with products, we are led to believe that $\widehat{M_t(f)}$
is the product of the corresponding Fourier transforms. Indeed, we have
the identity

\[ \widehat{M_t(f)}(\xi) = \hat{f}(\xi) \frac{\sin(2\pi |\xi| t)}{2\pi |\xi| t}. \tag{7} \]

To see this, write

\[ \widehat{M_t(f)}(\xi) = \int_{\mathbb{R}^3} e^{-2\pi i x \cdot \xi} \left( \frac{1}{4\pi} \int_{S^2} f(x - \gamma t)\, d\sigma(\gamma) \right) dx, \]

and note that we may interchange the order of integration and make a
simple change of variables to achieve the desired identity.

As a result, we find that the solution of our problem may be expressed 
by using the spherical means of the initial data. 

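Theorem 3.6 The solution when $d = 3$ of the Cauchy problem for the
wave equation

\[ \Delta u = \frac{\partial^2 u}{\partial t^2} \quad \text{subject to} \quad u(x, 0) = f(x) \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = g(x) \]

is given by

\[ u(x, t) = \frac{\partial}{\partial t} \left( t M_t(f)(x) \right) + t M_t(g)(x). \]

Proof. Consider first the problem

\[ \Delta u = \frac{\partial^2 u}{\partial t^2} \quad \text{subject to} \quad u(x, 0) = 0 \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = g(x). \]

Then by Theorem 3.1, we know that its solution $u_1$ is given by

\[ \begin{aligned}
u_1(x, t) &= \int_{\mathbb{R}^3} \left[ \hat{g}(\xi) \frac{\sin(2\pi |\xi| t)}{2\pi |\xi|} \right] e^{2\pi i x \cdot \xi}\, d\xi \\
&= t \int_{\mathbb{R}^3} \left[ \hat{g}(\xi) \frac{\sin(2\pi |\xi| t)}{2\pi |\xi| t} \right] e^{2\pi i x \cdot \xi}\, d\xi \\
&= t M_t(g)(x),
\end{aligned} \]

where we have used (7) applied to $g$, and the Fourier inversion formula.
According to Theorem 3.1 again, the solution to the problem

\[ \Delta u = \frac{\partial^2 u}{\partial t^2} \quad \text{subject to} \quad u(x, 0) = f(x) \quad \text{and} \quad \frac{\partial u}{\partial t}(x, 0) = 0 \]

is given by

\[ \begin{aligned}
u_2(x, t) &= \int_{\mathbb{R}^3} \left[ \hat{f}(\xi) \cos(2\pi |\xi| t) \right] e^{2\pi i x \cdot \xi}\, d\xi \\
&= \frac{\partial}{\partial t} \left( t \int_{\mathbb{R}^3} \left[ \hat{f}(\xi) \frac{\sin(2\pi |\xi| t)}{2\pi |\xi| t} \right] e^{2\pi i x \cdot \xi}\, d\xi \right) \\
&= \frac{\partial}{\partial t} \left( t M_t(f)(x) \right).
\end{aligned} \]

We may now superpose these two solutions to obtain $u = u_1 + u_2$ as the
solution of our original problem.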
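
For a concrete feel of the spherical-mean formula, one can take the illustrative initial velocity $g(x) = e^{-\pi |x|^2}$ (a choice made here, not in the text). By rotation invariance it is enough to place $x = (0, 0, r)$, so that $|x - t\gamma|^2 = r^2 + t^2 - 2rtu$ with $u = \gamma_3$, and the mean reduces to a one-dimensional integral in $u$. Evaluating that integral by hand gives $t M_t(g)(x) = \left( e^{-\pi (r - t)^2} - e^{-\pi (r + t)^2} \right) / (4\pi r)$. The sketch below compares a Simpson-rule evaluation with this closed form; all names are hypothetical.

```python
import math


def gaussian(r):
    return math.exp(-math.pi * r * r)


def spherical_mean_gaussian(r, t, n=400):
    """M_t(g)(x) for g(x) = exp(-pi |x|^2) and |x| = r, via Simpson's rule in u."""
    h = 2.0 / n  # n must be even for Simpson's rule
    total = 0.0
    for j in range(n + 1):
        u = -1.0 + j * h
        w = 1.0 if j in (0, n) else (4.0 if j % 2 == 1 else 2.0)
        total += w * math.exp(-math.pi * (r * r + t * t - 2.0 * r * t * u))
    # (1/4pi) * (2pi from the phi-integral) * (Simpson approximation in u)
    return 0.5 * total * h / 3.0


def u1(r, t):
    """Solution t * M_t(g) of the problem with zero initial position."""
    return t * spherical_mean_gaussian(r, t)


def u1_closed_form(r, t):
    """Closed form obtained by integrating in u by hand (requires r > 0)."""
    return (gaussian(r - t) - gaussian(r + t)) / (4.0 * math.pi * r)


if __name__ == "__main__":
    for r, t in ((0.7, 0.5), (1.2, -0.9)):
        print(r, t, u1(r, t), u1_closed_form(r, t))
```

In this example $r\, u_1$ is a function of $r - t$ minus the same function of $r + t$, so it satisfies the one-dimensional wave equation in $(r, t)$, which is the radial form of the three-dimensional equation.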


Huygens principle 

The solutions to the wave equation in one and three dimensions are given,
respectively, by

\[ u(x, t) = \frac{f(x + t) + f(x - t)}{2} + \frac{1}{2} \int_{x - t}^{x + t} g(y)\, dy \]

and

\[ u(x, t) = \frac{\partial}{\partial t} \left( t M_t(f)(x) \right) + t M_t(g)(x). \]

We observe that in the one-dimensional problem, the value of the solution
at $(x, t)$ depends only on the values of $f$ and $g$ in the interval centered
at $x$ of length $2t$, as shown in Figure 1.

If in addition $g = 0$, then the solution depends only on the data at the
two boundary points of this interval. In three dimensions, this boundary
dependence always holds. More precisely, the solution $u(x, t)$ depends
only on the values of $f$ and $g$ in an immediate neighborhood of the sphere
centered at $x$ and of radius $t$. This situation is depicted in Figure 2, where
we have drawn the cone originating at $(x, t)$ and with its base the ball
centered at $x$ of radius $t$. This cone is called the backward light cone
originating at $(x, t)$.


Figure 2. Backward light cone originating at $(x, t)$


Alternatively, the data at a point $x_0$ in the plane $t = 0$ influences the
solution only on the boundary of a cone originating at $x_0$, called the
forward light cone and depicted in Figure 3.

Figure 3. The forward light cone originating at $x_0$

This phenomenon, known as the Huygens principle, is immediate
from the formulas for $u$ given above.

Another important aspect of the wave equation connected with these
considerations is that of the finite speed of propagation. (In the
case where $c = 1$, the speed is 1.) This means that if we have an initial
disturbance localized at $x = x_0$, then after a finite time $t$, its effects will
have propagated only inside the ball centered at $x_0$ of radius $|t|$. To state
this precisely, suppose the initial conditions $f$ and $g$ are supported in the
ball of radius $\delta$, centered at $x_0$ (think of $\delta$ as small). Then $u(x, t)$ is
supported in the ball of radius $|t| + \delta$ centered at $x_0$. This assertion is
clear from the above discussion.


3.3 The wave equation in $\mathbb{R}^2 \times \mathbb{R}$: descent


It is a remarkable fact that the solution of the wave equation in three 
dimensions leads to a solution of the wave equation in two dimensions. 
Define the corresponding means by

\[ \widetilde{M}_t(F)(x) = \frac{1}{2\pi} \int_{|y| \le 1} F(x - ty) (1 - |y|^2)^{-1/2}\, dy. \]

Theorem 3.7 A solution of the Cauchy problem for the wave equation
in two dimensions with initial data $f, g \in \mathcal{S}(\mathbb{R}^2)$ is given by

\[ u(x, t) = \frac{\partial}{\partial t} \left( t \widetilde{M}_t(f)(x) \right) + t \widetilde{M}_t(g)(x). \tag{8} \]

Notice the difference between this case and the case $d = 3$. Here, $u$ at
$(x, t)$ depends on $f$ and $g$ in the whole disc (of radius $|t|$ centered at $x$),
and not just on the values of the initial data near the boundary of that
disc.


Formally, the identity in the theorem arises as follows. If we start
with an initial pair of functions $f$ and $g$ in $\mathcal{S}(\mathbb{R}^2)$, we may consider the
corresponding functions $\tilde{f}$ and $\tilde{g}$ on $\mathbb{R}^3$ that are merely extensions of $f$
and $g$ that are constant in the $x_3$ variable, that is,

\[ \tilde{f}(x_1, x_2, x_3) = f(x_1, x_2) \quad \text{and} \quad \tilde{g}(x_1, x_2, x_3) = g(x_1, x_2). \]

Now, if $\tilde{u}$ is the solution (given in the previous section) of the 3-dimensional
wave equation with initial data $\tilde{f}$ and $\tilde{g}$, then one can expect that $\tilde{u}$ is
also constant in $x_3$ so that $\tilde{u}$ satisfies the 2-dimensional wave equation.
A difficulty with this argument is that $\tilde{f}$ and $\tilde{g}$ are not rapidly decreasing
since they are constant in $x_3$, so that our previous methods do not apply.
However, it is easy to modify the argument so as to obtain a proof of
Theorem 3.7.

We fix $T > 0$ and consider a function $\eta(x_3)$ that is in $\mathcal{S}(\mathbb{R})$, such that
$\eta(x_3) = 1$ if $|x_3| \le 3T$. The trick is to truncate $\tilde{f}$ and $\tilde{g}$ in the $x_3$-variable,
and consider instead

\[ \tilde{f}^{\flat}(x_1, x_2, x_3) = f(x_1, x_2) \eta(x_3) \quad \text{and} \quad \tilde{g}^{\flat}(x_1, x_2, x_3) = g(x_1, x_2) \eta(x_3). \]

Now both $\tilde{f}^{\flat}$ and $\tilde{g}^{\flat}$ are in $\mathcal{S}(\mathbb{R}^3)$, so Theorem 3.6 provides a solution
$\tilde{u}^{\flat}$ of the wave equation with initial data $\tilde{f}^{\flat}$ and $\tilde{g}^{\flat}$. It is easy to see
from the formula that $\tilde{u}^{\flat}(x, t)$ is independent of $x_3$, whenever $|x_3| \le T$
and $|t| \le T$. In particular, if we define $u(x_1, x_2, t) = \tilde{u}^{\flat}(x_1, x_2, 0, t)$, then
$u$ satisfies the 2-dimensional wave equation when $|t| \le T$. Since $T$ is
arbitrary, $u$ is a solution to our problem, and it remains to see why $u$ has
the desired form.

By definition of the spherical coordinates, we recall that the integral
of a function $H$ over the sphere $S^2$ is given by

\[ \frac{1}{4\pi} \int_{S^2} H(\gamma)\, d\sigma(\gamma) = \frac{1}{4\pi} \int_0^{2\pi} \int_0^{\pi} H(\sin\theta \cos\varphi, \sin\theta \sin\varphi, \cos\theta) \sin\theta\, d\theta\, d\varphi. \]

If $H$ does not depend on the last variable, that is, $H(x_1, x_2, x_3) = h(x_1, x_2)$
for some function $h$ of two variables, then

\[ M_t(H)(x_1, x_2, 0) = \frac{1}{4\pi} \int_0^{2\pi} \int_0^{\pi} h(x_1 - t \sin\theta \cos\varphi, x_2 - t \sin\theta \sin\varphi) \sin\theta\, d\theta\, d\varphi. \]

To calculate this last integral, we split the $\theta$-integral from $0$ to $\pi/2$ and
then $\pi/2$ to $\pi$. By making the change of variables $r = \sin\theta$, we find, after
a final change to polar coordinates, that

\[ M_t(H)(x_1, x_2, 0) = \frac{1}{2\pi} \int_{|y| \le 1} h(x - ty) (1 - |y|^2)^{-1/2}\, dy = \widetilde{M}_t(h)(x_1, x_2). \]

Applying this to $H = \tilde{f}^{\flat}$, $h = f$, and $H = \tilde{g}^{\flat}$, $h = g$, we find that $u$ is
given by the formula (8), and the proof of Theorem 3.7 is complete.
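
The key identity in the descent argument, that the spherical mean of a function independent of $x_3$ equals the weighted disc mean, can be tested numerically. The sketch below is an illustration under assumptions made here: the test function is a two-dimensional Gaussian, the sphere is parametrized by $u = \cos\theta$, and the weighted disc integral is computed in polar coordinates with the substitution $r = 1 - s^2$, which turns $r (1 - r^2)^{-1/2}\, dr$ into $2 (1 - s^2) / \sqrt{2 - s^2}\, ds$ and so removes the endpoint singularity.

```python
import math


def simpson(func, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        total += (4.0 if j % 2 == 1 else 2.0) * func(a + j * h)
    return total * h / 3.0


def circle_average(h2, x, radius, n_phi=64):
    """Average of h2 over the circle of the given radius centered at x."""
    total = 0.0
    for k in range(n_phi):
        phi = 2.0 * math.pi * k / n_phi
        total += h2(x[0] - radius * math.cos(phi), x[1] - radius * math.sin(phi))
    return total / n_phi


def spherical_mean_3d(h2, x, t, n=200):
    """M_t(H)(x1, x2, 0) for H(x1, x2, x3) = h2(x1, x2), using u = cos(theta)."""
    def inner(u):
        return circle_average(h2, x, t * math.sqrt(max(0.0, 1.0 - u * u)))
    # (1/4pi) * integral du dphi = (1/2) * integral in u of the circle average
    return 0.5 * simpson(inner, -1.0, 1.0, n)


def disc_mean_2d(h2, x, t, n=200):
    """(1/2pi) * integral over |y| <= 1 of h2(x - t y) (1 - |y|^2)^(-1/2) dy."""
    def inner(s):
        r = 1.0 - s * s
        return circle_average(h2, x, t * r) * 2.0 * r / math.sqrt(2.0 - s * s)
    # the (1/2pi) * phi-integral is the circle average, leaving the s-integral
    return simpson(inner, 0.0, 1.0, n)


if __name__ == "__main__":
    h2 = lambda a, b: math.exp(-math.pi * (a * a + b * b))
    x, t = (0.3, -0.2), 0.8
    print(spherical_mean_3d(h2, x, t), disc_mean_2d(h2, x, t))
```

Applied to the constant function 1, `disc_mean_2d` returns 1 up to quadrature error, which reflects the normalization $\frac{1}{2\pi} \int_{|y| \le 1} (1 - |y|^2)^{-1/2}\, dy = 1$.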

Remark. In the case of general $d$, the solution of the wave equation
shares many of the properties we have discussed in the special cases
$d = 1$, $2$, and $3$.

• At a given time $t$, the initial data at a point $x$ only affects the solution
$u$ in a specific region. When $d > 1$ is odd, the data influences
only the points on the boundary of the forward light cone originating
at $x$, while when $d = 1$ or $d$ is even, it affects all points of
the forward light cone. Alternatively, the solution at a point $(x, t)$
depends only on the data at the base of the backward light cone
originating at $(x, t)$. In fact, when $d > 1$ is odd, only the data in an
immediate neighborhood of the boundary of the base will influence
$u(x, t)$.

• Waves propagate with finite speed: if the initial data is supported
in a bounded set, then the support of the solution $u$ spreads with
velocity 1 (or more generally $c$, if the wave equation is not normalized).

We can illustrate some of these facts by the following observation about
the different behavior of the propagation of waves in three and two dimensions.
Since the propagation of light is governed by the three-dimensional
wave equation, if at $t = 0$ a light flashes at the origin, the following happens:
any observer will see the flash (after a finite amount of time) only
for an instant. In contrast, consider what happens in two dimensions. If
we drop a stone in a lake, any point on the surface will begin (after some
time) to undulate; although the amplitude of the oscillations will decrease
over time, the undulations will continue (in principle) indefinitely.

The difference in character of the formulas for the solutions of the
wave equation when $d = 1$ and $d = 3$ on the one hand, and $d = 2$ on
the other hand, illustrates a general principle in $d$-dimensional Fourier
analysis: a significant number of formulas that arise are simpler in the
case of odd dimensions, compared to the corresponding situations in even
dimensions. We will see several further examples of this below.

4 Radial symmetry and Bessel functions 

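We observed earlier that the Fourier transform of a radial function in
$\mathbb{R}^d$ is also radial. In other words, if $f(x) = f_0(|x|)$ for some $f_0$, then
$\hat{f}(\xi) = F_0(|\xi|)$ for some $F_0$. A natural problem is to determine a relation
between $f_0$ and $F_0$.

This problem has a simple answer in dimensions one and three. If
$d = 1$ the relation we seek is

\[ F_0(\rho) = 2 \int_0^\infty \cos(2\pi \rho r) f_0(r)\, dr. \tag{9} \]

If we recall that $\mathbb{R}$ has only two rotations, the identity and multiplication
by $-1$, we find that a function is radial precisely when it is even. Having
made this observation it is easy to see that if $f$ is radial, and $|\xi| = \rho$,
then

\[ \begin{aligned}
F_0(\rho) = \hat{f}(|\xi|) &= \int_{-\infty}^{\infty} f(x) e^{-2\pi i x |\xi|}\, dx \\
&= \int_0^\infty f_0(r) \left( e^{-2\pi i r |\xi|} + e^{2\pi i r |\xi|} \right) dr \\
&= 2 \int_0^\infty \cos(2\pi \rho r) f_0(r)\, dr.
\end{aligned} \]

In the case $d = 3$, the relation between $f_0$ and $F_0$ is also quite simple
and given by the formula

\[ F_0(\rho) = 2 \rho^{-1} \int_0^\infty \sin(2\pi \rho r) f_0(r) r\, dr. \tag{10} \]

The proof of this identity is based on the formula for the Fourier transform
of the surface element $d\sigma$ given in Lemma 3.5:

\[ \begin{aligned}
F_0(\rho) = \hat{f}(\xi) &= \int_{\mathbb{R}^3} f(x) e^{-2\pi i x \cdot \xi}\, dx \\
&= \int_0^\infty f_0(r) \int_{S^2} e^{-2\pi i r \gamma \cdot \xi}\, d\sigma(\gamma)\, r^2\, dr \\
&= \int_0^\infty f_0(r) \frac{2 \sin(2\pi \rho r)}{\rho r} r^2\, dr \\
&= 2 \rho^{-1} \int_0^\infty \sin(2\pi \rho r) f_0(r) r\, dr.
\end{aligned} \]
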
More generally, the relation between $f_0$ and $F_0$ has a nice description
in terms of a family of special functions that arise naturally in problems
that exhibit radial symmetry.

The Bessel function of order $n \in \mathbb{Z}$, denoted $J_n(\rho)$, is defined as the
$n$th Fourier coefficient of the function $e^{i\rho \sin\theta}$. So

\[ J_n(\rho) = \frac{1}{2\pi} \int_0^{2\pi} e^{i\rho \sin\theta} e^{-in\theta}\, d\theta, \]

therefore

\[ e^{i\rho \sin\theta} = \sum_{n = -\infty}^{\infty} J_n(\rho) e^{in\theta}. \]

As a result of this definition, we find that when $d = 2$, the relation between
$f_0$ and $F_0$ is

\[ F_0(\rho) = 2\pi \int_0^\infty J_0(2\pi r \rho) f_0(r) r\, dr. \tag{11} \]

Indeed, since $\hat{f}(\xi)$ is radial we take $\xi = (0, -\rho)$ so that

\[ \begin{aligned}
\hat{f}(\xi) &= \int_{\mathbb{R}^2} f(x) e^{2\pi i x \cdot (0, \rho)}\, dx \\
&= \int_0^{2\pi} \int_0^\infty f_0(r) e^{2\pi i r \rho \sin\theta} r\, dr\, d\theta \\
&= 2\pi \int_0^\infty J_0(2\pi r \rho) f_0(r) r\, dr,
\end{aligned} \]

as desired.

In general, there are corresponding formulas relating $f_0$ and $F_0$ in $\mathbb{R}^d$
in terms of Bessel functions of order $d/2 - 1$ (see Problem 2). In even
dimensions, these are the Bessel functions we have defined above. For
odd dimensions, we need a more general definition of Bessel functions to
encompass half-integral orders. Note that the formulas for the Fourier
transform of radial functions give another illustration of the differences
between odd and even dimensions. When $d = 1$ or $d = 3$ (as well as
$d > 3$, $d$ odd) the formulas are in terms of elementary functions, but this
is not the case when d is even.
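
The three radial formulas (9), (10), and (11) can be compared on a single test profile. For the Gaussian $f_0(r) = e^{-\pi r^2}$ the transform is $e^{-\pi |\xi|^2}$ in every dimension, so each formula should return $e^{-\pi \rho^2}$. The sketch below checks this numerically; $J_0$ is computed directly from its definition as a Fourier coefficient, and the truncation radius, grid sizes, and function names are assumptions of this illustration.

```python
import math


def simpson(func, a, b, n):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for j in range(1, n):
        total += (4.0 if j % 2 == 1 else 2.0) * func(a + j * h)
    return total * h / 3.0


def bessel_j0(z, n_theta=128):
    """J_0(z) as the 0th Fourier coefficient of exp(i z sin(theta)).
    The imaginary part averages to zero, so only cos(z sin(theta)) is summed."""
    return sum(math.cos(z * math.sin(2.0 * math.pi * k / n_theta))
               for k in range(n_theta)) / n_theta


def f0(r):
    """Radial profile of the Gaussian exp(-pi |x|^2)."""
    return math.exp(-math.pi * r * r)


R_MAX = 6.0   # f0 is negligible beyond this radius
N_R = 1200    # number of Simpson subintervals (even)


def radial_transform(rho, d):
    """F_0(rho) computed from formula (9), (11), or (10) for d = 1, 2, 3."""
    c = 2.0 * math.pi * rho
    if d == 1:
        return 2.0 * simpson(lambda r: math.cos(c * r) * f0(r), 0.0, R_MAX, N_R)
    if d == 2:
        return 2.0 * math.pi * simpson(lambda r: bessel_j0(c * r) * f0(r) * r, 0.0, R_MAX, N_R)
    if d == 3:
        return (2.0 / rho) * simpson(lambda r: math.sin(c * r) * f0(r) * r, 0.0, R_MAX, N_R)
    raise ValueError("only d = 1, 2, 3 are implemented in this sketch")


if __name__ == "__main__":
    rho = 0.8
    for d in (1, 2, 3):
        print(d, radial_transform(rho, d), math.exp(-math.pi * rho * rho))
```

Only the case $d = 2$ requires a special function; the odd-dimensional cases use nothing beyond sine and cosine.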

5 The Radon transform and some of its applications 

Invented by Johann Radon in 1917, the integral transform we discuss
next has many applications in mathematics and other sciences, including
a significant achievement in medicine. To motivate the definitions
and the central problem of reconstruction, we first present the close connection
between the Radon transform and the development of X-ray
scans (or CAT scans) in the theory of medical imaging. The solution of
the reconstruction problem, and the introduction of new algorithms and
faster computers, all contributed to a rapid development of computerized
tomography. In practice, X-ray scans provide a "picture" of an internal
organ, one that helps to detect and locate many types of abnormalities.
After a brief description of X-ray scans in two dimensions, we define
the X-ray transform and formulate the basic problem of inverting this
mapping. Although this problem has an explicit solution in $\mathbb{R}^2$, it is
more complicated than the analogous problem in three dimensions, hence
we give a complete solution of the reconstruction problem only in $\mathbb{R}^3$.
Here we have another example where results are simpler in the odd-dimensional
case than in the even-dimensional situation.

5.1 The X-ray transform in $\mathbb{R}^2$

Consider a two-dimensional object $\mathcal{O}$ lying in the plane $\mathbb{R}^2$, which we
may think of as a planar cross section of a human organ.

First, we assume that $\mathcal{O}$ is homogeneous, and suppose that a very
narrow beam of X-ray photons traverses this object.

If $I_0$ and $I$ denote the intensity of the beam before and after passing
through $\mathcal{O}$, respectively, the following relation holds:

\[ I = I_0 e^{-d\rho}. \]

Here $d$ is the distance traveled by the beam in the object, and $\rho$ denotes
the attenuation coefficient (or absorption coefficient), which depends on
the density and other physical characteristics of $\mathcal{O}$. If the object is not
homogeneous, but consists of two materials with attenuation coefficients
$\rho_1$ and $\rho_2$, then the observed decrease in the intensity of the beam is
given by

\[ I = I_0 e^{-d_1 \rho_1 - d_2 \rho_2}, \]

where $d_1$ and $d_2$ denote the distances traveled by the beam in each material.
In the case of an arbitrary object whose density and physical
characteristics vary from point to point, the attenuation factor is a function
$\rho$ in $\mathbb{R}^2$, and the above relations become

\[ I = I_0 e^{-\int_L \rho}. \]

Here $L$ is the line in $\mathbb{R}^2$ traced by the beam, and $\int_L \rho$ denotes the line
integral of $\rho$ over $L$. Since we observe $I$ and $I_0$, the data we gather
after sending the X-ray beam through the object along the line $L$ is the
quantity

\[ \int_L \rho. \]

Since we may initially send the beam in any given direction, we may
calculate the above integral for every line in $\mathbb{R}^2$. We define the X-ray
transform (or Radon transform in $\mathbb{R}^2$) of $\rho$ by

\[ X(\rho)(L) = \int_L \rho. \]

Note that this transform assigns to each appropriate function $\rho$ on $\mathbb{R}^2$
(for example, $\rho \in \mathcal{S}(\mathbb{R}^2)$) another function $X(\rho)$ whose domain is the set
of lines $L$ in $\mathbb{R}^2$.
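
To make the definition concrete, the following sketch computes the X-ray transform of an illustrative attenuation function along a single line and the resulting transmitted intensity. The Gaussian attenuation profile, the parametrization of a line by a point and a direction, and all function names are assumptions made here for illustration. For this particular profile the line integral equals $e^{-\pi s^2}$, where $s$ is the distance from the origin to the line, which gives a value to compare against.

```python
import math


def attenuation(x1, x2):
    """Illustrative attenuation coefficient: a Gaussian bump centered at the origin."""
    return math.exp(-math.pi * (x1 * x1 + x2 * x2))


def x_ray_transform(rho, point, direction, half_length=8.0, n=4000):
    """Line integral of rho over L = {point + s * w}, where w is the unit
    vector along `direction`; trapezoid rule in the arclength parameter s."""
    norm = math.hypot(direction[0], direction[1])
    w1, w2 = direction[0] / norm, direction[1] / norm
    h = 2.0 * half_length / n
    total = 0.0
    for k in range(n + 1):
        s = -half_length + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * rho(point[0] + s * w1, point[1] + s * w2)
    return total * h


def transmitted_intensity(i0, rho, point, direction):
    """I = I_0 * exp(-integral of rho over L)."""
    return i0 * math.exp(-x_ray_transform(rho, point, direction))


if __name__ == "__main__":
    # vertical line at distance 0.5 from the origin
    print(x_ray_transform(attenuation, (0.5, 0.0), (0.0, 2.0)), math.exp(-math.pi * 0.25))
    # line through the origin
    print(transmitted_intensity(1.0, attenuation, (1.0, 1.0), (1.0, 1.0)))
```

In an actual scan the roles are reversed: $I$ and $I_0$ are measured, and $-\log(I / I_0)$ supplies the value of $X(\rho)(L)$.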

The unknown is $\rho$, and since our original interest lies precisely in the
composition of the object, the problem now becomes to reconstruct the
function $\rho$ from the collected data, that is, its X-ray transform. We
therefore pose the following reconstruction problem: Find a formula for
$\rho$ in terms of $X(\rho)$.

Mathematically, the problem asks for a formula giving the inverse of
$X$. Does such an inverse even exist? As a first step, we pose the following
simpler uniqueness question: If $X(\rho) = X(\rho')$, can we conclude that $\rho = \rho'$?

There is a reasonable a priori expectation that $X(\rho)$ actually determines
$\rho$, as one can see by counting the dimensionality (or degrees of
freedom) involved. A function $\rho$ on $\mathbb{R}^2$ depends on two parameters (the
$x_1$ and $x_2$ coordinates, for example). Similarly, the function $X(\rho)$, which
is a function of lines $L$, is also determined by two parameters (for example,
the slope of $L$ and its $x_2$-intercept). In this sense, $\rho$ and $X(\rho)$
convey an equivalent amount of information, so it is not unreasonable to
suppose that $X(\rho)$ determines $\rho$.

While there is a satisfactory answer to the reconstruction problem,
and a positive answer to the uniqueness question in $\mathbb{R}^2$, we shall forego
giving them here. (However, see Exercise 13 and Problem 8.) Instead
we shall deal with the analogous but simpler situation in $\mathbb{R}^3$.

Let us finally remark that in fact, one can sample the X-ray transform,
and determine $X(\rho)(L)$ for only finitely many lines. Therefore,
the reconstruction method implemented in practice is based not only on
the general theory, but also on sampling procedures, numerical approximations,
and computer algorithms. It turns out that a method used
in developing effective relevant algorithms is the fast Fourier transform,
which incidentally we take up in the next chapter.

5.2 The Radon transform in $\mathbb{R}^3$

The experiment described in the previous section applies in three
dimensions as well. If $O$ is an object in $\mathbb{R}^3$ determined by a function $p$ which
describes the density and physical characteristics of this object, sending
an X-ray beam through $O$ determines the quantity

$$X(p)(L) = \int_L p$$

for every line $L$ in $\mathbb{R}^3$. In $\mathbb{R}^2$ this knowledge was enough to uniquely
determine $p$, but in $\mathbb{R}^3$ we do not need as much data. In fact, by using the
heuristic argument above of counting the number of degrees of freedom,
we see that for functions $p$ in $\mathbb{R}^3$ the number is three, while the number
of parameters determining a line $L$ in $\mathbb{R}^3$ is four (for example, two for
the intercept in the $(x_1, x_2)$ plane, and two more for the direction of the
line). Thus in this sense, the problem is over-determined.

We turn instead to the natural mathematical generalization of the
two-dimensional problem. Here we wish to determine the function in $\mathbb{R}^3$ by
knowing its integral over all planes$^{3}$ in $\mathbb{R}^3$. To be precise, when we speak
of a plane, we mean a plane not necessarily passing through the origin.
If $\mathcal{P}$ is such a plane, we define the Radon transform $\mathcal{R}(f)$ by

$$\mathcal{R}(f)(\mathcal{P}) = \int_{\mathcal{P}} f.$$

To simplify our presentation, we shall follow our practice of assuming
that we are dealing with functions in the class $\mathcal{S}(\mathbb{R}^3)$. However, many
of the results obtained below can be shown to be valid for much larger
classes of functions.

$^{3}$ Note that the dimensionality associated with points on $\mathbb{R}^3$, and that for planes in $\mathbb{R}^3$,
equals three in both cases.

First, we explain what we mean by the integral of $f$ over a plane. The
description we use for planes in $\mathbb{R}^3$ is the following: given a unit vector
$\gamma \in S^2$ and a number $t \in \mathbb{R}$, we define the plane $\mathcal{P}_{t,\gamma}$ by

$$\mathcal{P}_{t,\gamma} = \{x \in \mathbb{R}^3 : x \cdot \gamma = t\}.$$

So we parametrize a plane by a unit vector $\gamma$ orthogonal to it, and by its
“distance” $t$ to the origin (see Figure 5). Note that $\mathcal{P}_{t,\gamma} = \mathcal{P}_{-t,-\gamma}$, and
we allow $t$ to take negative values.



Figure 5. Description of a plane in $\mathbb{R}^3$


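Given a function $f \in \mathcal{S}(\mathbb{R}^3)$, we need to make sense of its integral over
$\mathcal{P}_{t,\gamma}$. We proceed as follows. Choose unit vectors $e_1, e_2$ so that $e_1, e_2, \gamma$ is
an orthonormal basis for $\mathbb{R}^3$. Then any $x \in \mathcal{P}_{t,\gamma}$ can be written uniquely
as

$$x = t\gamma + u \quad \text{where } u = u_1 e_1 + u_2 e_2 \text{ with } u_1, u_2 \in \mathbb{R}.$$

If $f \in \mathcal{S}(\mathbb{R}^3)$, we define

$$(12) \qquad \int_{\mathcal{P}_{t,\gamma}} f = \int_{\mathbb{R}^2} f(t\gamma + u_1 e_1 + u_2 e_2)\, du_1\, du_2.$$

To be consistent, we must check that this definition is independent of
the choice of the vectors $e_1, e_2$.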
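Proposition 5.1 If $f \in \mathcal{S}(\mathbb{R}^3)$, then for each $\gamma$ the definition of $\int_{\mathcal{P}_{t,\gamma}} f$
is independent of the choice of $e_1$ and $e_2$. Moreover

$$\int_{-\infty}^{\infty} \left( \int_{\mathcal{P}_{t,\gamma}} f \right) dt = \int_{\mathbb{R}^3} f(x)\, dx.$$

Proof. If $e_1', e_2'$ is another choice of basis vectors so that $\gamma, e_1', e_2'$ is
orthonormal, consider the rotation $R$ in $\mathbb{R}^2$ which takes $e_1$ to $e_1'$ and
$e_2$ to $e_2'$. Changing variables $u' = R(u)$ in the integral proves that our
definition (12) is independent of the choice of basis.

To prove the formula, let $R$ denote the rotation which takes the
standard basis of unit vectors$^{4}$ in $\mathbb{R}^3$ to $\gamma$, $e_1$, and $e_2$. Then

$$\int_{\mathbb{R}^3} f(x)\, dx = \int_{\mathbb{R}^3} f(Rx)\, dx
  = \int_{\mathbb{R}^3} f(x_1 \gamma + x_2 e_1 + x_3 e_2)\, dx_1\, dx_2\, dx_3
  = \int_{-\infty}^{\infty} \left( \int_{\mathcal{P}_{t,\gamma}} f \right) dt.$$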
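Both claims of Proposition 5.1 can be checked numerically on a concrete function. The following is only a sketch under stated assumptions: the test function (an anisotropic Gaussian whose integral over $\mathbb{R}^3$ is $1/\sqrt{6}$), the helper names `frame`, `plane_integral` and `total_integral`, the midpoint rule, and the truncation of the plane to a square of half-width 4 are all choices made for illustration and do not come from the text.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def frame(gamma, angle=0.0):
    """Return (e1, e2) such that gamma, e1, e2 is orthonormal.

    `angle` rotates the pair inside the plane, which gives a second
    admissible choice of basis vectors (hypothetical helper).
    """
    i = min(range(3), key=lambda j: abs(gamma[j]))  # axis least aligned with gamma
    a = tuple(1.0 if j == i else 0.0 for j in range(3))
    p = dot(a, gamma)
    b1 = normalize(tuple(a[j] - p * gamma[j] for j in range(3)))
    b2 = cross(gamma, b1)
    c, s = math.cos(angle), math.sin(angle)
    e1 = tuple(c * b1[j] + s * b2[j] for j in range(3))
    e2 = tuple(-s * b1[j] + c * b2[j] for j in range(3))
    return e1, e2

def plane_integral(f, t, gamma, angle=0.0, half_width=4.0, n=60):
    """Midpoint-rule approximation of definition (12)."""
    e1, e2 = frame(gamma, angle)
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        u1 = -half_width + (i + 0.5) * h
        for j in range(n):
            u2 = -half_width + (j + 0.5) * h
            x = tuple(t * gamma[k] + u1 * e1[k] + u2 * e2[k] for k in range(3))
            total += f(x)
    return total * h * h

def total_integral(f, gamma, half_width=4.0, n=60):
    """Integrate the plane integrals over t, as in the second claim."""
    h = 2.0 * half_width / n
    return h * sum(plane_integral(f, -half_width + (i + 0.5) * h, gamma, n=n)
                   for i in range(n))

def f(x):
    # Anisotropic Gaussian; its integral over R^3 is 1/sqrt(6).
    return math.exp(-math.pi * (x[0] ** 2 + 2 * x[1] ** 2 + 3 * x[2] ** 2))

gamma = normalize((1.0, 2.0, 2.0))
```

Comparing `plane_integral(f, 0.3, gamma, 0.0)` with `plane_integral(f, 0.3, gamma, 1.1)` tests independence of the basis, and `total_integral(f, gamma)` can be compared with `1 / math.sqrt(6)`.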



Remark. We digress to point out that the X-ray transform determines
the Radon transform, since two-dimensional integrals can be expressed
as iterated one-dimensional integrals. In other words, the knowledge
of the integral of a function over all lines determines the integral of
that function over any plane.

Having disposed of these preliminary matters, we turn to the study of
the original problem. The Radon transform of a function $f \in \mathcal{S}(\mathbb{R}^3)$
is defined by

$$\mathcal{R}(f)(t, \gamma) = \int_{\mathcal{P}_{t,\gamma}} f.$$

In particular, we see that the Radon transform is a function on the
set of planes in $\mathbb{R}^3$. From the parametrization given for a plane, we
may equivalently think of $\mathcal{R}(f)$ as a function on the product $\mathbb{R} \times S^2 =
\{(t, \gamma) : t \in \mathbb{R},\ \gamma \in S^2\}$, where $S^2$ denotes the unit sphere in $\mathbb{R}^3$. The
relevant class of functions on $\mathbb{R} \times S^2$ consists of those that satisfy the
Schwartz condition in $t$ uniformly in $\gamma$. In other words, we define $\mathcal{S}(\mathbb{R} \times S^2)$
to be the space of all continuous functions $F(t, \gamma)$ that are indefinitely
differentiable in $t$, and that satisfy

$$\sup_{t \in \mathbb{R},\ \gamma \in S^2} |t|^k \left| \frac{\partial^\ell F}{\partial t^\ell}(t, \gamma) \right| < \infty \quad \text{for all integers } k, \ell \ge 0.$$

$^{4}$ Here we are referring to the vectors $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$.


Our goal is to solve the following problems. 

Uniqueness problem: If $\mathcal{R}(f) = \mathcal{R}(g)$, then $f = g$.

Reconstruction problem: Express $f$ in terms of $\mathcal{R}(f)$.

The solutions will be obtained by using the Fourier transform. In fact, 
the key point is a very elegant and essential relation between the Radon 
and Fourier transforms. 

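Lemma 5.2 If $f \in \mathcal{S}(\mathbb{R}^3)$, then $\mathcal{R}(f)(t, \gamma) \in \mathcal{S}(\mathbb{R})$ for each fixed $\gamma$.
Moreover,

$$\widehat{\mathcal{R}}(f)(s, \gamma) = \hat{f}(s\gamma).$$

To be precise, $\hat{f}$ denotes the (three-dimensional) Fourier transform
of $f$, while $\widehat{\mathcal{R}}(f)(s, \gamma)$ denotes the one-dimensional Fourier transform of
$\mathcal{R}(f)(t, \gamma)$ as a function of $t$, with $\gamma$ fixed.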

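Proof. Since $f \in \mathcal{S}(\mathbb{R}^3)$, for every positive integer $N$ there is a
constant $A_N < \infty$ so that

$$(1 + |t|)^N (1 + |u|)^N |f(t\gamma + u)| \le A_N,$$

if we recall that $x = t\gamma + u$, where $\gamma$ is orthogonal to $u$. Therefore, as
soon as $N \ge 3$, we find

$$(1 + |t|)^N |\mathcal{R}(f)(t, \gamma)| \le A_N \int_{\mathbb{R}^2} \frac{du}{(1 + |u|)^N} < \infty.$$

A similar argument for the derivatives shows that $\mathcal{R}(f)(t, \gamma) \in \mathcal{S}(\mathbb{R})$ for
each fixed $\gamma$.

To establish the identity, we first note that

$$\widehat{\mathcal{R}}(f)(s, \gamma) = \int_{-\infty}^{\infty} \left( \int_{\mathcal{P}_{t,\gamma}} f \right) e^{-2\pi i s t}\, dt
  = \int_{-\infty}^{\infty} \int_{\mathbb{R}^2} f(t\gamma + u_1 e_1 + u_2 e_2)\, du_1\, du_2\, e^{-2\pi i s t}\, dt.$$

However, since $\gamma \cdot u = 0$ and $|\gamma| = 1$, we may write

$$e^{-2\pi i s t} = e^{-2\pi i s\gamma \cdot (t\gamma + u)}.$$

As a result, we find that

$$\widehat{\mathcal{R}}(f)(s, \gamma) = \int_{-\infty}^{\infty} \int_{\mathbb{R}^2} f(t\gamma + u_1 e_1 + u_2 e_2)\, e^{-2\pi i s\gamma \cdot (t\gamma + u)}\, du_1\, du_2\, dt
  = \int_{-\infty}^{\infty} \int_{\mathbb{R}^2} f(t\gamma + u)\, e^{-2\pi i s\gamma \cdot (t\gamma + u)}\, du\, dt.$$

A final rotation from $\gamma, e_1, e_2$ to the standard basis in $\mathbb{R}^3$ proves that
$\widehat{\mathcal{R}}(f)(s, \gamma) = \hat{f}(s\gamma)$, as desired.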
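As an illustration of the lemma (a worked example added here, not part of the original argument), one can take the Gaussian $f(x) = e^{-\pi |x|^2}$, for which both sides are explicit. The only facts assumed are that $\int_{\mathbb{R}^2} e^{-\pi |u|^2}\, du = 1$ and that the Gaussian $e^{-\pi |x|^2}$ is its own Fourier transform in every dimension.

```latex
% Radon transform of the Gaussian: |t\gamma + u|^2 = t^2 + |u|^2 since \gamma \perp u.
\mathcal{R}(f)(t,\gamma)
  = \int_{\mathbb{R}^2} e^{-\pi (t^2 + |u|^2)}\, du
  = e^{-\pi t^2}.
% One-dimensional Fourier transform in t:
\widehat{\mathcal{R}}(f)(s,\gamma) = e^{-\pi s^2}.
% Three-dimensional Fourier transform evaluated at \xi = s\gamma, using |\gamma| = 1:
\hat{f}(s\gamma) = e^{-\pi |s\gamma|^2} = e^{-\pi s^2}.
```

The two expressions agree, as the lemma predicts, and neither depends on $\gamma$ because $f$ is radial.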

As a consequence of this identity, we can answer the uniqueness
question for the Radon transform in $\mathbb{R}^3$ in the affirmative.

Corollary 5.3 If $f, g \in \mathcal{S}(\mathbb{R}^3)$ and $\mathcal{R}(f) = \mathcal{R}(g)$, then $f = g$.

The proof of the corollary follows from an application of the lemma to
the difference $f - g$ and use of the Fourier inversion theorem.

Our final task is to give the formula that allows us to recover $f$ from
its Radon transform. Since $\mathcal{R}(f)$ is a function on the set of planes in
$\mathbb{R}^3$, and $f$ is a function of the space variables $x \in \mathbb{R}^3$, to recover $f$ we
introduce the dual Radon transform, which passes from functions defined
on planes to functions in $\mathbb{R}^3$.

Given a function $F$ on $\mathbb{R} \times S^2$, we define its dual Radon transform
by

$$(13) \qquad \mathcal{R}^*(F)(x) = \int_{S^2} F(x \cdot \gamma, \gamma)\, d\sigma(\gamma).$$

Observe that a point $x$ belongs to $\mathcal{P}_{t,\gamma}$ if and only if $x \cdot \gamma = t$, so the idea
here is that given $x \in \mathbb{R}^3$, we obtain $\mathcal{R}^*(F)(x)$ by integrating $F$ over the
subset of all planes passing through $x$, that is,

$$\mathcal{R}^*(F)(x) = \int_{\{\mathcal{P}_{t,\gamma} \text{ such that } x \in \mathcal{P}_{t,\gamma}\}} F,$$

where the integral on the right is given the precise meaning in (13).
We use the terminology “dual” because of the following observation. If
$V_1 = \mathcal{S}(\mathbb{R}^3)$ with the usual Hermitian inner product

$$(f, g)_1 = \int_{\mathbb{R}^3} f(x) \overline{g(x)}\, dx,$$

and $V_2 = \mathcal{S}(\mathbb{R} \times S^2)$ with the Hermitian inner product

$$(F, G)_2 = \int_{\mathbb{R}} \int_{S^2} F(t, \gamma) \overline{G(t, \gamma)}\, d\sigma(\gamma)\, dt,$$

then

$$\mathcal{R} : V_1 \to V_2, \qquad \mathcal{R}^* : V_2 \to V_1,$$

with

$$(14) \qquad (\mathcal{R} f, F)_2 = (f, \mathcal{R}^* F)_1.$$

The validity of this identity is not needed in the argument below, and
its verification is left as an exercise for the reader.

We can now state the reconstruction theorem. 

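Theorem 5.4 If $f \in \mathcal{S}(\mathbb{R}^3)$, then

$$\Delta(\mathcal{R}^* \mathcal{R}(f)) = -8\pi^2 f.$$

We recall that $\Delta = \dfrac{\partial^2}{\partial x_1^2} + \dfrac{\partial^2}{\partial x_2^2} + \dfrac{\partial^2}{\partial x_3^2}$ is the Laplacian.

Proof. By our previous lemma, we have

$$\mathcal{R}(f)(t, \gamma) = \int_{-\infty}^{\infty} \hat{f}(s\gamma)\, e^{2\pi i t s}\, ds.$$

Therefore

$$\mathcal{R}^* \mathcal{R}(f)(x) = \int_{S^2} \int_{-\infty}^{\infty} \hat{f}(s\gamma)\, e^{2\pi i x \cdot \gamma s}\, ds\, d\sigma(\gamma),$$

hence

$$\begin{aligned}
\Delta(\mathcal{R}^* \mathcal{R}(f))(x)
 &= \int_{S^2} \int_{-\infty}^{\infty} \hat{f}(s\gamma)(-4\pi^2 s^2)\, e^{2\pi i x \cdot \gamma s}\, ds\, d\sigma(\gamma) \\
 &= -4\pi^2 \int_{S^2} \int_{-\infty}^{\infty} \hat{f}(s\gamma)\, e^{2\pi i x \cdot \gamma s} s^2\, ds\, d\sigma(\gamma) \\
 &= -4\pi^2 \int_{S^2} \int_{-\infty}^{0} \hat{f}(s\gamma)\, e^{2\pi i x \cdot \gamma s} s^2\, ds\, d\sigma(\gamma)
    - 4\pi^2 \int_{S^2} \int_{0}^{\infty} \hat{f}(s\gamma)\, e^{2\pi i x \cdot \gamma s} s^2\, ds\, d\sigma(\gamma) \\
 &= -8\pi^2 \int_{S^2} \int_{0}^{\infty} \hat{f}(s\gamma)\, e^{2\pi i x \cdot \gamma s} s^2\, ds\, d\sigma(\gamma) \\
 &= -8\pi^2 f(x).
\end{aligned}$$

In the first line, we have differentiated under the integral sign and used
the fact $\Delta(e^{2\pi i x \cdot \gamma s}) = (-4\pi^2 s^2) e^{2\pi i x \cdot \gamma s}$, since $|\gamma| = 1$. The two integrals
in the third line are equal, as one sees from the change of variables
$s \mapsto -s$, $\gamma \mapsto -\gamma$. The last step follows from the formula for polar
coordinates in $\mathbb{R}^3$ and the Fourier inversion theorem.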
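The reconstruction formula can be verified by hand in one explicit case. The computation below is an added illustration, not part of the original proof; it assumes the Gaussian $f(x) = e^{-\pi |x|^2}$, whose Radon transform is $\mathcal{R}(f)(t, \gamma) = e^{-\pi t^2}$, and the standard expression $\Delta g = r^{-1} (r g)''$ for the Laplacian of a radial function $g(r)$ in $\mathbb{R}^3$.

```latex
% Dual transform of F(t,\gamma) = e^{-\pi t^2}; write r = |x| and u = \cos of the
% angle between x and \gamma, so that x \cdot \gamma = r u and d\sigma = 2\pi\, du.
\mathcal{R}^*\mathcal{R}(f)(x)
  = \int_{S^2} e^{-\pi (x \cdot \gamma)^2}\, d\sigma(\gamma)
  = 2\pi \int_{-1}^{1} e^{-\pi r^2 u^2}\, du
  = \frac{4\pi}{r} \int_0^r e^{-\pi v^2}\, dv =: g(r).
% Radial Laplacian in R^3:
(r g)'(r) = 4\pi e^{-\pi r^2}, \qquad
(r g)''(r) = -8\pi^2 r\, e^{-\pi r^2}, \qquad
\Delta g = \frac{1}{r}(r g)'' = -8\pi^2 e^{-\pi r^2} = -8\pi^2 f(x).
```

This agrees with the statement $\Delta(\mathcal{R}^* \mathcal{R}(f)) = -8\pi^2 f$.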

5.3 A note about plane waves 

We conclude this chapter by briefly mentioning a nice connection between
the Radon transform and solutions of the wave equation. This comes
about in the following way. Recall that when $d = 1$, the solution of
the wave equation can be expressed as the sum of traveling waves (see
Chapter 1), and it is natural to ask if an analogue of such traveling
waves exists in higher dimensions. The answer is as follows. Let $F$ be
a function of one variable, which we assume is sufficiently smooth (say
$C^2$), and consider $u(x, t)$ defined by

$$u(x, t) = F((x \cdot \gamma) - t)$$

where $x \in \mathbb{R}^d$ and $\gamma$ is a unit vector in $\mathbb{R}^d$. It is easy to verify directly
that $u$ is a solution of the wave equation in $\mathbb{R}^d$ (with $c = 1$). Such a
solution is called a plane wave; indeed, notice that $u$ is constant on
every plane perpendicular to the direction $\gamma$, and as time $t$ increases, the
wave travels in the $\gamma$ direction. (It should be remarked that plane waves
are never functions in $\mathcal{S}(\mathbb{R}^d)$ when $d > 1$ because they are constant in
directions perpendicular to $\gamma$.)$^{5}$

The basic fact is that when $d > 1$, the solution of the wave equation
can be written as an integral (as opposed to sum, when $d = 1$) of plane
waves; this can in fact be done via the Radon transform of the initial
data $f$ and $g$. For the relevant formulas when $d = 3$, see Problem 6.

$^{5}$ Incidentally, this observation is further indication that a fuller treatment of the wave
equation requires lifting the restriction that functions belong to $\mathcal{S}(\mathbb{R}^d)$.

6 Exercises

1. Suppose that $R$ is a rotation in the plane $\mathbb{R}^2$, and let

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

denote its matrix with respect to the standard basis vectors $e_1 = (1, 0)$ and
$e_2 = (0, 1)$.

(a) Write the conditions $R^t = R^{-1}$ and $\det(R) = \pm 1$ in terms of equations in
$a, b, c, d$.

(b) Show that there exists $\varphi \in \mathbb{R}$ such that $a + ib = e^{i\varphi}$.

(c) Conclude that if $R$ is proper, then it can be expressed as $z \mapsto z e^{-i\varphi}$, and if
$R$ is improper, then it takes the form $z \mapsto \bar{z} e^{i\varphi}$, where $\bar{z} = x - iy$.

2. Suppose that $R : \mathbb{R}^3 \to \mathbb{R}^3$ is a proper rotation.

(a) Show that $p(t) = \det(R - tI)$ is a polynomial of degree 3, and prove that
there exists $\gamma \in S^2$ (where $S^2$ denotes the unit sphere in $\mathbb{R}^3$) with

$$R(\gamma) = \gamma.$$

[Hint: Use the fact that $p(0) > 0$ to see that there is $\lambda > 0$ with $p(\lambda) = 0$.
Then $R - \lambda I$ is singular, so its kernel is non-trivial.]

(b) If $\mathcal{P}$ denotes the plane perpendicular to $\gamma$ and passing through the origin,
show that

$$R : \mathcal{P} \to \mathcal{P},$$

and that this linear map is a rotation.


3. Recall the formula

$$\int_{\mathbb{R}^d} F(x)\, dx = \int_{S^{d-1}} \int_0^\infty F(r\gamma)\, r^{d-1}\, dr\, d\sigma(\gamma).$$

Apply this to the special case when $F(x) = g(r) f(\gamma)$, where $x = r\gamma$, to prove
that for any rotation $R$, one has

$$\int_{S^{d-1}} f(R(\gamma))\, d\sigma(\gamma) = \int_{S^{d-1}} f(\gamma)\, d\sigma(\gamma),$$

whenever $f$ is a continuous function on the sphere $S^{d-1}$.


4. Let $A_d$ and $V_d$ denote the area and volume of the unit sphere and unit ball
in $\mathbb{R}^d$, respectively.

(a) Prove the formula

$$A_d = \frac{2\pi^{d/2}}{\Gamma(d/2)}$$

so that $A_2 = 2\pi$, $A_3 = 4\pi$, $A_4 = 2\pi^2, \ldots$. Here $\Gamma(x) = \int_0^\infty e^{-t} t^{x-1}\, dt$ is
the Gamma function. [Hint: Use polar coordinates and the fact that
$\int_{\mathbb{R}^d} e^{-\pi |x|^2}\, dx = 1$.]

(b) Show that $d\, V_d = A_d$, hence

$$V_d = \frac{\pi^{d/2}}{\Gamma(d/2 + 1)}.$$

In particular $V_2 = \pi$, $V_3 = 4\pi/3, \ldots$.
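The special values quoted in this exercise can be tabulated directly from the two closed formulas. This is a small sketch for orientation only; the function names `sphere_area` and `ball_volume` are made up for the example, and the code evaluates the formulas rather than proving them.

```python
import math

def sphere_area(d):
    """A_d = 2 * pi^(d/2) / Gamma(d/2), the area of the unit sphere in R^d."""
    return 2.0 * math.pi ** (d / 2) / math.gamma(d / 2)

def ball_volume(d):
    """V_d = pi^(d/2) / Gamma(d/2 + 1), the volume of the unit ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

for d in range(1, 6):
    # The last column checks the relation d * V_d = A_d from part (b).
    print(d, sphere_area(d), ball_volume(d), d * ball_volume(d) - sphere_area(d))
```

The relation $d\, V_d = A_d$ shows up in the last column as values that vanish up to rounding.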
5. Let $A$ be a $d \times d$ positive definite symmetric matrix with real coefficients.
Show that

$$\int_{\mathbb{R}^d} e^{-\pi (x, A(x))}\, dx = (\det(A))^{-1/2}.$$

This generalizes the fact that $\int_{\mathbb{R}^d} e^{-\pi |x|^2}\, dx = 1$, which corresponds to the case
where $A$ is the identity.

[Hint: Apply the spectral theorem to write $A = R D R^{-1}$ where $R$ is a rotation
and $D$ is diagonal with entries $\lambda_1, \ldots, \lambda_d$, where $\{\lambda_i\}$ are the eigenvalues of $A$.]


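6. Suppose $\psi \in \mathcal{S}(\mathbb{R}^d)$ satisfies $\int |\psi(x)|^2\, dx = 1$. Show that

$$\left( \int_{\mathbb{R}^d} |x|^2 |\psi(x)|^2\, dx \right) \left( \int_{\mathbb{R}^d} |\xi|^2 |\hat{\psi}(\xi)|^2\, d\xi \right) \ge \frac{d^2}{16\pi^2}.$$

This is the statement of the Heisenberg uncertainty principle in $d$ dimensions.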



7. Consider the time-dependent heat equation in $\mathbb{R}^d$:

$$(15) \qquad \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_d^2}, \quad \text{where } t > 0,$$

with boundary values $u(x, 0) = f(x) \in \mathcal{S}(\mathbb{R}^d)$. If

$$\mathcal{H}_t^{(d)}(x) = \frac{1}{(4\pi t)^{d/2}} e^{-|x|^2/4t} = \int_{\mathbb{R}^d} e^{-4\pi^2 t |\xi|^2} e^{2\pi i x \cdot \xi}\, d\xi$$

is the $d$-dimensional heat kernel, show that the convolution

$$u(x, t) = (f * \mathcal{H}_t^{(d)})(x)$$

is indefinitely differentiable when $x \in \mathbb{R}^d$ and $t > 0$. Moreover, $u$ solves (15), and
is continuous up to the boundary $t = 0$ with $u(x, 0) = f(x)$.

The reader may also wish to formulate the $d$-dimensional analogues of
Theorem 2.1 and 2.3 in Chapter 5.


8. In Chapter 5, we found that a solution to the steady-state heat equation in the
upper half-plane with boundary values $f$ is given by the convolution $u = f * \mathcal{P}_y$
where the Poisson kernel is

$$\mathcal{P}_y(x) = \frac{1}{\pi} \frac{y}{x^2 + y^2} \quad \text{where } x \in \mathbb{R} \text{ and } y > 0.$$

More generally, one can calculate the $d$-dimensional Poisson kernel using the
Fourier transform as follows.

(a) The subordination principle allows one to write expressions involving
the function $e^{-x}$ in terms of corresponding expressions involving the
function $e^{-x^2}$. One form of this is the identity

$$e^{-\beta} = \int_0^\infty \frac{e^{-u}}{\sqrt{\pi u}}\, e^{-\beta^2/4u}\, du$$

when $\beta \ge 0$. Prove this identity with $\beta = 2\pi |x|$ by taking the Fourier
transform of both sides.

(b) Consider the steady-state heat equation in the upper half-space $\{(x, y) :
x \in \mathbb{R}^d,\ y > 0\}$

$$\sum_{j=1}^{d} \frac{\partial^2 u}{\partial x_j^2} + \frac{\partial^2 u}{\partial y^2} = 0$$

with the Dirichlet boundary condition $u(x, 0) = f(x)$. A solution to this
problem is given by the convolution $u(x, y) = (f * P_y^{(d)})(x)$ where $P_y^{(d)}(x)$
is the $d$-dimensional Poisson kernel

$$P_y^{(d)}(x) = \int_{\mathbb{R}^d} e^{2\pi i x \cdot \xi} e^{-2\pi |\xi| y}\, d\xi.$$

Compute $P_y^{(d)}(x)$ by using the subordination principle and the $d$-dimensional
heat kernel. (See Exercise 7.) Show that

$$P_y^{(d)}(x) = \frac{\Gamma((d+1)/2)}{\pi^{(d+1)/2}} \frac{y}{(|x|^2 + y^2)^{(d+1)/2}}.$$
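Before proving the subordination identity of part (a), it can be reassuring to test it numerically. The sketch below is only a sanity check under stated assumptions: the substitution $u = v^2$, the midpoint rule, the cut-off at $v = 8$, and the function name `subordination_rhs` are all choices made for this illustration.

```python
import math

def subordination_rhs(beta, upper=8.0, n=4000):
    """Approximate the integral of e^{-u} (pi u)^{-1/2} e^{-beta^2/(4u)} over u > 0.

    With u = v^2 the integral becomes
        (2 / sqrt(pi)) * integral over v > 0 of exp(-v^2 - beta^2 / (4 v^2)) dv,
    whose integrand is smooth, so a midpoint rule on [0, upper] suffices.
    """
    h = upper / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        total += math.exp(-v * v - beta * beta / (4.0 * v * v))
    return 2.0 / math.sqrt(math.pi) * total * h

for beta in (0.0, 0.5, 1.0, 3.0):
    # The two printed columns should agree to many digits.
    print(beta, subordination_rhs(beta), math.exp(-beta))
```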


9. A spherical wave is a solution $u(x, t)$ of the Cauchy problem for the wave
equation in $\mathbb{R}^d$, which as a function of $x$ is radial. Prove that $u$ is a spherical
wave if and only if the initial data $f, g \in \mathcal{S}$ are both radial.


10. Let $u(x, t)$ be a solution of the wave equation, and let $E(t)$ denote the energy
of this wave

$$E(t) = \int_{\mathbb{R}^d} \left| \frac{\partial u}{\partial t}(x, t) \right|^2 + \sum_{j=1}^{d} \left| \frac{\partial u}{\partial x_j}(x, t) \right|^2 dx.$$

We have seen that $E(t)$ is constant using Plancherel's formula. Give an alternate
proof of this fact by differentiating the integral with respect to $t$ and showing
that

$$\frac{dE}{dt} = 0.$$

[Hint: Integrate by parts.]

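11. Show that the solution of the wave equation

$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2} + \frac{\partial^2 u}{\partial x_3^2}$$

subject to $u(x, 0) = f(x)$ and $\frac{\partial u}{\partial t}(x, 0) = g(x)$, where $f, g \in \mathcal{S}(\mathbb{R}^3)$, is given by

$$u(x, t) = \frac{1}{|S(x, t)|} \int_{S(x, t)} \left[ t g(y) + f(y) + \nabla f(y) \cdot (y - x) \right] d\sigma(y),$$

where $S(x, t)$ denotes the sphere of center $x$ and radius $t$, and $|S(x, t)|$ its area.
This is an alternate expression for the solution of the wave equation given in
Theorem 3.6. It is sometimes called Kirchhoff's formula.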


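12. Establish the identity (14) about the dual transform given in the text. In
other words, prove that

$$(16) \qquad \int_{\mathbb{R}} \int_{S^2} \mathcal{R}(f)(t, \gamma) \overline{F(t, \gamma)}\, d\sigma(\gamma)\, dt = \int_{\mathbb{R}^3} f(x) \overline{\mathcal{R}^*(F)(x)}\, dx$$

where $f \in \mathcal{S}(\mathbb{R}^3)$, $F \in \mathcal{S}(\mathbb{R} \times S^2)$, and

$$\mathcal{R}(f)(t, \gamma) = \int_{\mathcal{P}_{t,\gamma}} f \quad \text{and} \quad \mathcal{R}^*(F)(x) = \int_{S^2} F(x \cdot \gamma, \gamma)\, d\sigma(\gamma).$$

[Hint: Consider the integral

$$\iiint f(t\gamma + u_1 e_1 + u_2 e_2) \overline{F(t, \gamma)}\, dt\, d\sigma(\gamma)\, du_1\, du_2.$$

Integrating first in $u$ gives the left-hand side of (16), while integrating in $u$ and
$t$ and setting $x = t\gamma + u_1 e_1 + u_2 e_2$ gives the right-hand side.]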

13. For each $(t, \theta)$ with $t \in \mathbb{R}$ and $|\theta| \le \pi$, let $L = L_{t,\theta}$ denote the line in the
$(x, y)$-plane given by

$$x \cos\theta + y \sin\theta = t.$$

This is the line perpendicular to the direction $(\cos\theta, \sin\theta)$ at “distance” $t$ from
the origin (we allow negative $t$). For $f \in \mathcal{S}(\mathbb{R}^2)$ the X-ray transform or
two-dimensional Radon transform of $f$ is defined by

$$X(f)(t, \theta) = \int_{L_{t,\theta}} f = \int_{-\infty}^{\infty} f(t \cos\theta + u \sin\theta,\ t \sin\theta - u \cos\theta)\, du.$$

Calculate the X-ray transform of the function $f(x, y) = e^{-\pi (x^2 + y^2)}$.


14. Let $X$ be the X-ray transform. Show that if $f \in \mathcal{S}$ and $X(f) = 0$, then
$f = 0$, by taking the Fourier transform in one variable.


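15. For $F \in \mathcal{S}(\mathbb{R} \times S^1)$, define the dual X-ray transform $X^*(F)$ by integrating
$F$ over all lines that pass through the point $(x, y)$ (that is, those lines $L_{t,\theta}$
with $x \cos\theta + y \sin\theta = t$):

$$X^*(F)(x, y) = \int F(x \cos\theta + y \sin\theta, \theta)\, d\theta.$$

Check that in this case, if $f \in \mathcal{S}(\mathbb{R}^2)$ and $F \in \mathcal{S}(\mathbb{R} \times S^1)$, then

$$\iint X(f)(t, \theta) \overline{F(t, \theta)}\, dt\, d\theta = \iint f(x, y) \overline{X^*(F)(x, y)}\, dx\, dy.$$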


7 Problems 

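1. Let $J_n$ denote the $n$th order Bessel function, for $n \in \mathbb{Z}$. Prove that

(a) $J_n(\rho)$ is real for all real $\rho$.

(b) $J_{-n}(\rho) = (-1)^n J_n(\rho)$.

(c) $2 J_n'(\rho) = J_{n-1}(\rho) - J_{n+1}(\rho)$.

(d) $\left( \dfrac{2n}{\rho} \right) J_n(\rho) = J_{n-1}(\rho) + J_{n+1}(\rho)$.

(e) $(\rho^{-n} J_n(\rho))' = -\rho^{-n} J_{n+1}(\rho)$.

(f) $(\rho^{n} J_n(\rho))' = \rho^{n} J_{n-1}(\rho)$.

(g) $J_n(\rho)$ satisfies the second order differential equation

$$J_n''(\rho) + \rho^{-1} J_n'(\rho) + (1 - n^2/\rho^2) J_n(\rho) = 0.$$

(h) Show that

$$J_n(\rho) = \left( \frac{\rho}{2} \right)^n \sum_{m=0}^{\infty} (-1)^m \frac{\rho^{2m}}{2^{2m}\, m!\, (n+m)!}.$$

(i) Show that for all integers $n$ and all real numbers $a$ and $b$ we have

$$J_n(a + b) = \sum_{\ell \in \mathbb{Z}} J_\ell(a) J_{n-\ell}(b).$$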
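The recurrence in part (d) and the derivative formula in part (c) can be sanity-checked numerically from the power series in part (h). The code below is an illustrative sketch only: it assumes the series of part (h) as the definition for integers $n \ge 0$, truncates it after a fixed number of terms, and uses a central difference for the derivative; the name `bessel_j` is hypothetical.

```python
import math

def bessel_j(n, rho, terms=30):
    """J_n(rho) for an integer n >= 0, from the truncated power series in part (h)."""
    return (rho / 2.0) ** n * sum(
        (-1) ** m * rho ** (2 * m)
        / (4 ** m * math.factorial(m) * math.factorial(n + m))
        for m in range(terms)
    )

def recurrence_gap(n, rho):
    """Left side minus right side of (d): (2n/rho) J_n - (J_{n-1} + J_{n+1})."""
    return 2 * n / rho * bessel_j(n, rho) - (bessel_j(n - 1, rho) + bessel_j(n + 1, rho))

def derivative_gap(n, rho, h=1e-5):
    """Left side minus right side of (c), with J_n' replaced by a central difference."""
    d = (bessel_j(n, rho + h) - bessel_j(n, rho - h)) / (2 * h)
    return 2 * d - (bessel_j(n - 1, rho) - bessel_j(n + 1, rho))

for n in (1, 2, 3):
    for rho in (0.5, 1.0, 2.5):
        print(n, rho, recurrence_gap(n, rho), derivative_gap(n, rho))
```

Both gaps should be zero up to rounding and the finite-difference error; a nonzero gap would point to a mistake in the series or in the identity as transcribed.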
2. Another formula for $J_n(\rho)$ that allows one to define Bessel functions for
non-integral values of $n$ ($n > -1/2$) is

$$J_n(\rho) = \frac{(\rho/2)^n}{\Gamma(n + 1/2) \sqrt{\pi}} \int_{-1}^{1} e^{i\rho t} (1 - t^2)^{n - (1/2)}\, dt.$$

(a) Check that the above formula agrees with the definition of $J_n(\rho)$ for
integral $n \ge 0$. [Hint: Verify it for $n = 0$ and then check that both sides
satisfy the recursion formula (e) in Problem 1.]

(b) Note that $J_{1/2}(\rho) = \sqrt{\dfrac{2}{\pi}}\, \rho^{-1/2} \sin\rho$.

(c) Prove that

$$\lim_{n \to -1/2} J_n(\rho) = \sqrt{\frac{2}{\pi}}\, \rho^{-1/2} \cos\rho.$$

(d) Observe that the formulas we have proved in the text giving $F_0$ in terms
of $f_0$ (when describing the Fourier transform of a radial function) take the
form

$$(17) \qquad F_0(\rho) = 2\pi \rho^{-(d/2)+1} \int_0^\infty J_{(d/2)-1}(2\pi \rho r) f_0(r)\, r^{d/2}\, dr,$$

for $d = 1, 2$, and $3$, if one uses the formulas above with the understanding
that $J_{-1/2}(\rho) = \lim_{n \to -1/2} J_n(\rho)$. It turns out that the relation between
$F_0$ and $f_0$ given by (17) is valid in all dimensions $d$.


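3. We observed that the solution $u(x, t)$ of the Cauchy problem for the wave
equation given by formula (3) depends only on the initial data on the base of
the backward light cone. It is natural to ask if this property is shared by any
solution of the wave equation. An affirmative answer would imply uniqueness of
the solution.

Let $B(x_0, r_0)$ denote the closed ball in the hyperplane $t = 0$ centered at $x_0$
and of radius $r_0$. The backward light cone with base $B(x_0, r_0)$ is defined by

$$\mathcal{L}_{B(x_0, r_0)} = \{(x, t) \in \mathbb{R}^d \times \mathbb{R} : |x - x_0| \le r_0 - t,\ 0 \le t \le r_0\}.$$

Theorem Suppose that $u(x, t)$ is a $C^2$ function on the closed upper half-plane
$\{(x, t) : x \in \mathbb{R}^d,\ t \ge 0\}$ that solves the wave equation

$$\frac{\partial^2 u}{\partial t^2} = \Delta u.$$

If $u(x, 0) = \frac{\partial u}{\partial t}(x, 0) = 0$ for all $x \in B(x_0, r_0)$, then $u(x, t) = 0$ for all
$(x, t) \in \mathcal{L}_{B(x_0, r_0)}$.

In words, if the initial data of the Cauchy problem for the wave equation
vanishes on a ball $B$, then any solution $u$ of the problem vanishes in the backward
light cone with base $B$. The following steps outline a proof of the theorem.

(a) Assume that $u$ is real. For $0 \le t \le r_0$ let $B_t(x_0, r_0) = \{x : |x - x_0| \le r_0 - t\}$,
and also define

$$\nabla u(x, t) = \left( \frac{\partial u}{\partial x_1}, \ldots, \frac{\partial u}{\partial x_d}, \frac{\partial u}{\partial t} \right).$$

Now consider the energy integral

$$E(t) = \frac{1}{2} \int_{B_t(x_0, r_0)} |\nabla u|^2\, dx
      = \frac{1}{2} \int_{B_t(x_0, r_0)} \left( \frac{\partial u}{\partial t} \right)^2 + \sum_{j=1}^{d} \left( \frac{\partial u}{\partial x_j} \right)^2 dx.$$

Observe that $E(t) \ge 0$ and $E(0) = 0$. Prove that

$$E'(t) = \int_{B_t(x_0, r_0)} \frac{\partial u}{\partial t} \frac{\partial^2 u}{\partial t^2} + \sum_{j=1}^{d} \frac{\partial u}{\partial x_j} \frac{\partial^2 u}{\partial x_j \partial t}\, dx
        - \frac{1}{2} \int_{\partial B_t(x_0, r_0)} |\nabla u|^2\, d\sigma(\gamma).$$

(b) Show that

$$\frac{\partial}{\partial x_j} \left[ \frac{\partial u}{\partial x_j} \frac{\partial u}{\partial t} \right]
  = \frac{\partial u}{\partial x_j} \frac{\partial^2 u}{\partial x_j \partial t} + \frac{\partial^2 u}{\partial x_j^2} \frac{\partial u}{\partial t}.$$

(c) Use the last identity, the divergence theorem, and the fact that $u$ solves
the wave equation to prove that

$$E'(t) = \int_{\partial B_t(x_0, r_0)} \sum_{j=1}^{d} \frac{\partial u}{\partial x_j} \frac{\partial u}{\partial t} \nu_j\, d\sigma(\gamma)
        - \frac{1}{2} \int_{\partial B_t(x_0, r_0)} |\nabla u|^2\, d\sigma(\gamma),$$

where $\nu_j$ denotes the $j$th coordinate of the outward normal to $B_t(x_0, r_0)$.

(d) Use the Cauchy-Schwarz inequality to conclude that

$$\left| \sum_{j=1}^{d} \frac{\partial u}{\partial x_j} \frac{\partial u}{\partial t} \nu_j \right| \le \frac{1}{2} |\nabla u|^2,$$

and as a result, $E'(t) \le 0$. Deduce from this that $E(t) = 0$ and $u = 0$.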
4.* There exist formulas for the solution of the Cauchy problem for the wave
equation

$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x_1^2} + \cdots + \frac{\partial^2 u}{\partial x_d^2}
  \quad \text{with } u(x, 0) = f(x) \text{ and } \frac{\partial u}{\partial t}(x, 0) = g(x)$$

in $\mathbb{R}^d \times \mathbb{R}$ in terms of spherical means which generalize the formula given in the
text for $d = 3$. In fact, the solution for even dimensions is deduced from that for
odd dimensions, so we discuss this case first.

Suppose that $d > 1$ is odd and let $h \in \mathcal{S}(\mathbb{R}^d)$. The spherical mean of $h$ on the
ball centered at $x$ of radius $r$ is defined by

$$M_r h(x) = M h(x, r) = \frac{1}{A_d} \int_{S^{d-1}} h(x - r\gamma)\, d\sigma(\gamma),$$

where $A_d$ denotes the area of the unit sphere $S^{d-1}$ in $\mathbb{R}^d$.

(a) Show that

$$\Delta_x M h(x, r) = \left[ \partial_r^2 + \frac{d-1}{r} \partial_r \right] M h(x, r),$$

where $\Delta_x$ denotes the Laplacian in the space variables $x$, and $\partial_r = \partial/\partial r$.

(b) Show that a twice differentiable function $u(x, t)$ satisfies the wave equation
if and only if

$$\left[ \partial_r^2 + \frac{d-1}{r} \partial_r \right] M u(x, r, t) = \partial_t^2 M u(x, r, t),$$

where $M u(x, r, t)$ denote the spherical means of the function $u(x, t)$.

(c) If $d = 2k + 1$, define $T\varphi(r) = (r^{-1} \partial_r)^{k-1} [r^{2k-1} \varphi(r)]$, and let $\tilde{u} = T M u$.
Then this function solves the one-dimensional wave equation for each fixed
$x$:

$$\partial_t^2 \tilde{u}(x, r, t) = \partial_r^2 \tilde{u}(x, r, t).$$

One can then use d'Alembert's formula to find the solution $\tilde{u}(x, r, t)$ of
this problem expressed in terms of the initial data.

(d) Now show that

$$u(x, t) = M u(x, 0, t) = \lim_{r \to 0} \frac{\tilde{u}(x, r, t)}{\alpha r}$$

where $\alpha = 1 \cdot 3 \cdots (d - 2)$.

(e) Conclude that the solution of the Cauchy problem for the $d$-dimensional
wave equation, when $d > 1$ is odd, is

$$u(x, t) = \frac{1}{1 \cdot 3 \cdots (d-2)} \left[ \partial_t (t^{-1} \partial_t)^{(d-3)/2} \left( t^{d-2} M_t f(x) \right)
  + (t^{-1} \partial_t)^{(d-3)/2} \left( t^{d-2} M_t g(x) \right) \right].$$

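5.* The method of descent can be used to prove that the solution of the Cauchy
problem for the wave equation in the case when $d$ is even is given by the formula

$$u(x, t) = \frac{1}{1 \cdot 3 \cdots (d-1)} \left[ \partial_t (t^{-1} \partial_t)^{(d-2)/2} \left( t^{d-1} \widetilde{M}_t f(x) \right)
  + (t^{-1} \partial_t)^{(d-2)/2} \left( t^{d-1} \widetilde{M}_t g(x) \right) \right],$$

where $\widetilde{M}_t$ denotes the modified spherical means defined by

$$\widetilde{M}_t h(x) = \frac{2}{A_{d+1}} \int_{B^d} \frac{h(x + ty)}{\sqrt{1 - |y|^2}}\, dy,$$

and $B^d$ is the unit ball in $\mathbb{R}^d$.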


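6.* Given initial data $f$ and $g$ of the form

$$f(x) = F(x \cdot \gamma) \quad \text{and} \quad g(x) = G(x \cdot \gamma),$$

check that the plane wave given by

$$u(x, t) = \frac{F(x \cdot \gamma + t) + F(x \cdot \gamma - t)}{2} + \frac{1}{2} \int_{x \cdot \gamma - t}^{x \cdot \gamma + t} G(s)\, ds$$

is a solution of the Cauchy problem for the $d$-dimensional wave equation.

In general, the solution is given as a superposition of plane waves. For the
case $d = 3$, this can be expressed in terms of the Radon transform as follows.
Let

$$\widetilde{\mathcal{R}}(f)(t, \gamma) = -\frac{1}{8\pi^2} \left( \frac{d}{dt} \right)^2 \mathcal{R}(f)(t, \gamma).$$

Then

$$u(x, t) = \frac{1}{2} \int_{S^2} \left[ \widetilde{\mathcal{R}}(f)(x \cdot \gamma - t, \gamma) + \widetilde{\mathcal{R}}(f)(x \cdot \gamma + t, \gamma)
  + \int_{x \cdot \gamma - t}^{x \cdot \gamma + t} \widetilde{\mathcal{R}}(g)(s, \gamma)\, ds \right] d\sigma(\gamma).$$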
7. For every real number $\alpha > 0$, define the operator $(-\Delta)^\alpha$ by the formula

$$(-\Delta)^\alpha f(x) = \int_{\mathbb{R}^d} (2\pi |\xi|)^{2\alpha} \hat{f}(\xi)\, e^{2\pi i \xi \cdot x}\, d\xi$$

whenever $f \in \mathcal{S}(\mathbb{R}^d)$.

(a) Check that $(-\Delta)^\alpha$ agrees with the usual definition of the $\alpha$th power of
$-\Delta$ (that is, $\alpha$ compositions of minus the Laplacian) when $\alpha$ is a positive
integer.

(b) Verify that $(-\Delta)^\alpha (f)$ is indefinitely differentiable.

(c) Prove that if $\alpha$ is not an integer, then in general $(-\Delta)^\alpha (f)$ is not rapidly
decreasing.

(d) Let $u(x, y)$ be the solution of the steady-state heat equation

$$\frac{\partial^2 u}{\partial y^2} + \sum_{j=1}^{d} \frac{\partial^2 u}{\partial x_j^2} = 0, \quad \text{with } u(x, 0) = f(x),$$

given by convolving $f$ with the Poisson kernel (see Exercise 8). Check
that

$$(-\Delta)^{1/2} f(x) = -\lim_{y \to 0} \frac{\partial u}{\partial y}(x, y),$$

and more generally that

$$(-\Delta)^{k/2} f(x) = (-1)^k \lim_{y \to 0} \frac{\partial^k u}{\partial y^k}(x, y)$$

for any positive integer $k$.


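8.* The reconstruction formula for the Radon transform in $\mathbb{R}^d$ is as follows:

(a) When $d = 2$,

$$\frac{(-\Delta)^{1/2}}{4\pi} \mathcal{R}^*(\mathcal{R}(f)) = f,$$

where $(-\Delta)^{1/2}$ is defined in Problem 7.

(b) If the Radon transform and its dual are defined by analogy to the cases
$d = 2$ and $d = 3$, then for general $d$,

$$\frac{(2\pi)^{1-d}}{2} (-\Delta)^{(d-1)/2} \mathcal{R}^*(\mathcal{R}(f)) = f.$$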
Finite Fourier Analysis


This past year has seen the birth, or rather the rebirth,
of an exciting revolution in computing Fourier
transforms. A class of algorithms known as the fast
Fourier transform or FFT, is forcing a complete
reassessment of many computational paths, not only in
frequency analysis, but in any fields where problems
can be reduced to Fourier transforms and/or
convolutions...


C. Bingham and J. W. Tukey, 1966 


In the previous chapters we studied the Fourier series of functions on
the circle and the Fourier transform of functions defined on the Euclidean
space $\mathbb{R}^d$. The goal here is to introduce another version of Fourier
analysis, now for functions defined on finite sets, and more precisely, on finite
abelian groups. This theory is particularly elegant and simple since
infinite sums and integrals are replaced by finite sums, and thus questions
of convergence disappear.

In turning our attention to finite Fourier analysis, we begin with the
simplest example, $\mathbb{Z}(N)$, where the underlying space is the (multiplicative)
group of $N$th roots of unity on the circle. This group can also be
realized in additive form, as $\mathbb{Z}/N\mathbb{Z}$, the equivalence classes of integers
modulo $N$. The group $\mathbb{Z}(N)$ arises as the natural approximation to the
circle (as $N$ tends to infinity) since in the first picture the points of $\mathbb{Z}(N)$
correspond to $N$ points on the circle which are uniformly distributed. For
this reason, in practical applications, the group $\mathbb{Z}(N)$ becomes a natural
candidate for the storage of information of a function on the circle, and
for the resulting numerical computations involving Fourier series. The
situation is particularly nice when $N$ is large and of the form $N = 2^n$.
The computations of the Fourier coefficients now lead to the “fast Fourier
transform,” which exploits the fact that an induction in $n$ requires only
about $\log N$ steps to go from $N = 1$ to $N = 2^n$. This yields a substantial
saving in time in practical applications.

In the second part of the chapter we undertake the more general
theory of Fourier analysis on finite abelian groups. Here the fundamental
example is the multiplicative group $\mathbb{Z}^*(q)$. The Fourier inversion formula
for $\mathbb{Z}^*(q)$ will be seen to be a key step in the proof of Dirichlet's theorem
on primes in arithmetic progression, which we will take up in the next
chapter.

1 Fourier analysis on $\mathbb{Z}(N)$

We turn to the group of $N$th roots of unity. This group arises naturally as
the simplest finite abelian group. It also gives a uniform partition of the
circle, and is therefore a good choice if one wishes to sample appropriate
functions on the circle. Moreover, this partition gets finer as $N$ tends to
infinity, and one might expect that the discrete Fourier theory that we
discuss here tends to the continuous theory of Fourier series on the circle.
In a broad sense, this is the case, although this aspect of the problem is
not one that we develop.

1.1 The group $\mathbb{Z}(N)$

Let $N$ be a positive integer. A complex number $z$ is an $N$th root of
unity if $z^N = 1$. The set of $N$th roots of unity is precisely

$$\left\{ 1,\ e^{2\pi i/N},\ e^{2\pi i 2/N},\ \ldots,\ e^{2\pi i (N-1)/N} \right\}.$$

Indeed, suppose that $z^N = 1$ with $z = r e^{i\theta}$. Then we must have $r^N e^{iN\theta} = 1$,
and taking absolute values yields $r = 1$. Therefore $e^{iN\theta} = 1$, and this
means that $N\theta = 2\pi k$ where $k \in \mathbb{Z}$. So if $\zeta = e^{2\pi i/N}$ we find that $\zeta^k$
exhausts all the $N$th roots of unity. However, notice that $\zeta^N = 1$ so if
$n$ and $m$ differ by an integer multiple of $N$, then $\zeta^n = \zeta^m$. In fact, it is
clear that

$$\zeta^n = \zeta^m \quad \text{if and only if} \quad n - m \text{ is divisible by } N.$$

We denote the set of all $N$th roots of unity by $\mathbb{Z}(N)$. The fact that
this set gives a uniform partition of the circle is clear from its definition.
Note that the set $\mathbb{Z}(N)$ satisfies the following properties:

(i) If $z, w \in \mathbb{Z}(N)$, then $zw \in \mathbb{Z}(N)$ and $zw = wz$.

(ii) $1 \in \mathbb{Z}(N)$.

(iii) If $z \in \mathbb{Z}(N)$, then $z^{-1} = 1/z \in \mathbb{Z}(N)$ and of course $z z^{-1} = 1$.

As a result we can conclude that $\mathbb{Z}(N)$ is an abelian group under complex
multiplication. The appropriate definitions are set out in detail later in
Section 2.1.
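Properties (i) to (iii) are easy to observe numerically for a small $N$. The following sketch is offered only as an illustration: it lists the $N$th roots of unity as floating-point complex numbers, so membership is tested up to a tolerance, and the helper names `roots_of_unity` and `index_of` are invented for the example.

```python
import cmath

def roots_of_unity(N):
    """Return [1, zeta, zeta^2, ..., zeta^(N-1)] with zeta = exp(2*pi*i/N)."""
    return [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]

def index_of(z, roots, tol=1e-9):
    """Exponent k with z equal to zeta^k up to `tol`, or None if z is not in the list."""
    for k, w in enumerate(roots):
        if abs(z - w) < tol:
            return k
    return None

N = 9
roots = roots_of_unity(N)

# (i) closure under multiplication, (iii) inverses; (ii) holds since roots[0] == 1.
closed = all(index_of(z * w, roots) is not None for z in roots for w in roots)
inverses = all(index_of(1 / z, roots) == (-k) % N for k, z in enumerate(roots))
print(closed, inverses)
```

Note that the inverse of $\zeta^k$ is found at exponent $-k$ reduced modulo $N$, which anticipates the additive description of the group.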
[Figure: two panels showing points on the unit circle; left panel $\mathbb{Z}(9)$, $\zeta = e^{2\pi i/9}$; right panel $\mathbb{Z}(N)$, $N = 2^6$]

Figure 1. The group of $N$th roots of unity when $N = 9$ and $N = 2^6 = 64$


There is another way to visualize the group $\mathbb{Z}(N)$. This consists of
choosing the integer power of $\zeta$ that determines each root of unity. We
observed above that this integer is not unique since $\zeta^n = \zeta^m$ whenever $n$
and $m$ differ by an integer multiple of $N$. Naturally, we might select the
integer which satisfies $0 \le n \le N - 1$. Although this choice is perfectly
reasonable in terms of “sets,” we ask what happens when we multiply
roots of unity. Clearly, we must add the corresponding integers since
$\zeta^n \zeta^m = \zeta^{n+m}$, but nothing guarantees that $0 \le n + m \le N - 1$. In fact,
if $\zeta^n \zeta^m = \zeta^k$ with $0 \le k \le N - 1$, then $n + m$ and $k$ differ by an integer
multiple of $N$. So, to find the integer in $[0, N - 1]$ corresponding to the
root of unity $\zeta^n \zeta^m$, we see that after adding the integers $n$ and $m$ we
must reduce modulo $N$, that is, find the unique integer $0 \le k \le N - 1$
so that $(n + m) - k$ is an integer multiple of $N$.

An equivalent approach is to associate to each root of unity $\omega$ the
class of integers $n$ so that $\zeta^n = \omega$. Doing so for each root of unity we
obtain a partition of the integers in $N$ disjoint infinite classes. To add
two of these classes, choose any integer in each one of them, say $n$ and
$m$, respectively, and define the sum of the classes to be the class which
contains the integer $n + m$.

We formalize the above notions. Two integers $x$ and $y$ are congruent
modulo $N$ if the difference $x - y$ is divisible by $N$, and we write
$x \equiv y \mod N$. In other words, this means that $x$ and $y$ differ by an
integer multiple of $N$. It is an easy exercise to check the following three
properties:

• $x \equiv x \mod N$ for all integers $x$.

• If $x \equiv y \mod N$, then $y \equiv x \mod N$.

• If $x \equiv y \mod N$ and $y \equiv z \mod N$, then $x \equiv z \mod N$.

The above defines an equivalence relation on $\mathbb{Z}$. Let $R(x)$ denote the
equivalence class, or residue class, of the integer $x$. Any integer of the
form $x + kN$ with $k \in \mathbb{Z}$ is an element (or “representative”) of $R(x)$.
In fact, there are precisely $N$ equivalence classes, and each class has a
unique representative between $0$ and $N - 1$. We may now add
equivalence classes by defining

$$R(x) + R(y) = R(x + y).$$

This definition is of course independent of the representatives $x$ and $y$
because if $x' \in R(x)$ and $y' \in R(y)$, then one checks easily that $x' + y' \in
R(x + y)$. This turns the set of equivalence classes into an abelian group
called the group of integers modulo $N$, which is sometimes denoted
by $\mathbb{Z}/N\mathbb{Z}$. The association

$$R(k) \longleftrightarrow e^{2\pi i k/N}$$

gives a correspondence between the two abelian groups, $\mathbb{Z}/N\mathbb{Z}$ and $\mathbb{Z}(N)$.
Since the operations are respected, in the sense that addition of integers
modulo $N$ becomes multiplication of complex numbers, we shall
also denote the group of integers modulo $N$ by $\mathbb{Z}(N)$. Observe that
$0 \in \mathbb{Z}/N\mathbb{Z}$ corresponds to $1$ on the unit circle.
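The correspondence $R(k) \leftrightarrow e^{2\pi i k/N}$ and the independence of the sum from the chosen representatives can be seen in a few lines of code. This is a minimal sketch, assuming each residue class is represented by its unique member in $\{0, \ldots, N-1\}$ (which is what Python's `%` operator returns for a positive modulus); the names `residue`, `add_classes` and `root` are hypothetical.

```python
import cmath

def residue(x, N):
    """Representative of the class R(x) lying in {0, ..., N-1}."""
    return x % N

def add_classes(x, y, N):
    """R(x) + R(y) = R(x + y), returned through its representative."""
    return residue(x + y, N)

def root(k, N):
    """The root of unity e^{2 pi i k / N} associated with R(k)."""
    return cmath.exp(2j * cmath.pi * k / N)

N = 12
# Different representatives of the same two classes give the same sum.
print(add_classes(7, 9, N), add_classes(7 + 5 * N, 9 - 3 * N, N))
# Addition of classes corresponds to multiplication of roots of unity.
print(abs(root(add_classes(7, 9, N), N) - root(7, N) * root(9, N)))
```

The second printed quantity is zero up to rounding, and `root(0, N)` equals 1, matching the remark that $0$ corresponds to $1$ on the unit circle.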

Let $V$ and $W$ denote the vector spaces of complex-valued functions on
the group of integers modulo $N$ and the $N$th roots of unity, respectively.
Then, the identification given above carries over to $V$ and $W$ as follows:

$$F(k) \longleftrightarrow f(e^{2\pi i k/N}),$$

where $F$ is a function on the integers modulo $N$ and $f$ is a function on
the $N$th roots of unity.

From now on, we write $\mathbb{Z}(N)$ but think of either the group of integers
modulo $N$ or the group of $N$th roots of unity.


1.2 Fourier inversion theorem and Plancherel identity on Z(N)

The first and most crucial step in developing Fourier analysis on $\mathbb{Z}(N)$ is
to find the functions which correspond to the exponentials $e_n(x) = e^{2\pi i n x}$
in the case of the circle. Some important properties of these exponentials
are:

(i) $\{e_n\}_{n \in \mathbb{Z}}$ is an orthonormal set for the inner product (1) (in Chapter 3)
on the space of Riemann integrable functions on the circle.

(ii) Finite linear combinations of the $e_n$'s (the trigonometric polynomials)
are dense in the space of continuous functions on the circle.

(iii) $e_n(x + y) = e_n(x) e_n(y)$.

On $\mathbb{Z}(N)$, the appropriate analogues are the $N$ functions $e_0, \ldots, e_{N-1}$
defined by

$$e_\ell(k) = \zeta^{\ell k} = e^{2\pi i \ell k/N} \quad \text{for } \ell = 0, \ldots, N-1 \text{ and } k = 0, \ldots, N-1,$$

where $\zeta = e^{2\pi i/N}$. To understand the parallel with (i) and (ii), we can
think of the complex-valued functions on $\mathbb{Z}(N)$ as a vector space $V$,
endowed with the Hermitian inner product

$$(F, G) = \sum_{k=0}^{N-1} F(k) \overline{G(k)}$$

and associated norm

$$\|F\|^2 = \sum_{k=0}^{N-1} |F(k)|^2.$$

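Lemma 1.1 The family $\{e_0, \ldots, e_{N-1}\}$ is orthogonal. In fact,

$$(e_m, e_\ell) = \begin{cases} N & \text{if } m = \ell, \\ 0 & \text{if } m \ne \ell. \end{cases}$$

Proof. We have

$$(e_m, e_\ell) = \sum_{k=0}^{N-1} \zeta^{mk} \zeta^{-\ell k} = \sum_{k=0}^{N-1} \zeta^{(m-\ell)k}.$$

If $m = \ell$, each term in the sum is equal to $1$, and the sum equals $N$. If
$m \ne \ell$, then $q = \zeta^{m-\ell}$ is not equal to $1$, and the usual formula

$$1 + q + q^2 + \cdots + q^{N-1} = \frac{1 - q^N}{1 - q}$$

shows that $(e_m, e_\ell) = 0$, because $q^N = 1$.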
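The orthogonality relations of Lemma 1.1 lend themselves to a quick numerical check. The sketch below is only an illustration under our own naming (`e`, `inner`, `gram_matrix` are hypothetical helpers); it builds the table of all inner products $(e_m, e_\ell)$, which should be $N$ on the diagonal and $0$ elsewhere up to rounding error.

```python
import cmath


def e(ell: int, k: int, N: int) -> complex:
    """The exponential e_ell(k) = zeta^{ell k} with zeta = e^{2 pi i / N}."""
    return cmath.exp(2j * cmath.pi * ell * k / N)


def inner(F, G) -> complex:
    """Hermitian inner product (F, G) = sum_k F(k) * conj(G(k)) on Z(N)."""
    return sum(f * g.conjugate() for f, g in zip(F, G))


def gram_matrix(N: int):
    """Matrix whose (m, ell) entry is the inner product (e_m, e_ell)."""
    basis = [[e(ell, k, N) for k in range(N)] for ell in range(N)]
    return [[inner(basis[m], basis[ell]) for ell in range(N)] for m in range(N)]
```

Calling `gram_matrix(8)` returns an 8 by 8 table that agrees with $8$ times the identity matrix up to floating-point error.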


Since the $N$ functions $e_0, \ldots, e_{N-1}$ are orthogonal, they must be linearly
independent, and since the vector space $V$ is $N$-dimensional, we
conclude that $\{e_0, \ldots, e_{N-1}\}$ is an orthogonal basis for $V$. Clearly, property
(iii) also holds, that is, $e_\ell(k + m) = e_\ell(k) e_\ell(m)$ for all $\ell$ and all
$k, m \in \mathbb{Z}(N)$.

By the lemma each vector $e_\ell$ has norm $\sqrt{N}$, so if we define

$$e_\ell^* = \frac{1}{\sqrt{N}} e_\ell,$$

then $\{e_0^*, \ldots, e_{N-1}^*\}$ is an orthonormal basis for $V$. Hence for any $F \in V$
we have

$$(1) \qquad F = \sum_{n=0}^{N-1} (F, e_n^*) e_n^* \quad \text{as well as} \quad \|F\|^2 = \sum_{n=0}^{N-1} |(F, e_n^*)|^2.$$

If we define the $n$th Fourier coefficient of $F$ by

$$a_n = \frac{1}{N} \sum_{k=0}^{N-1} F(k) e^{-2\pi i k n/N},$$

the above observations give the following fundamental theorem which is
the $\mathbb{Z}(N)$ version of the Fourier inversion and the Parseval-Plancherel
formulas.

Theorem 1.2 If $F$ is a function on $\mathbb{Z}(N)$, then

$$F(k) = \sum_{n=0}^{N-1} a_n e^{2\pi i n k/N}.$$

Moreover,

$$\sum_{n=0}^{N-1} |a_n|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |F(k)|^2.$$

The proof follows directly from (1) once we observe that

$$a_n = \frac{1}{N} (F, e_n) = \frac{1}{\sqrt{N}} (F, e_n^*).$$

Remark. It is possible to recover the Fourier inversion on the circle
for sufficiently smooth functions (say $C^2$) by letting $N \to \infty$ in the finite
model $\mathbb{Z}(N)$ (see Exercise 3).
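Both statements of Theorem 1.2 can be tested on a small example. The following Python sketch uses the normalization of the text (a factor $1/N$ in the coefficients and none in the inversion formula); the names `fourier_coefficients` and `reconstruct` are ours, chosen for illustration.

```python
import cmath


def fourier_coefficients(F):
    """a_n = (1/N) * sum_k F(k) e^{-2 pi i k n / N} for n = 0, ..., N-1."""
    N = len(F)
    return [
        sum(F[k] * cmath.exp(-2j * cmath.pi * k * n / N) for k in range(N)) / N
        for n in range(N)
    ]


def reconstruct(a):
    """F(k) = sum_n a_n e^{2 pi i n k / N}, the inversion formula."""
    N = len(a)
    return [
        sum(a[n] * cmath.exp(2j * cmath.pi * n * k / N) for n in range(N))
        for k in range(N)
    ]


# A sample function on Z(5), given by its values F(0), ..., F(4).
F = [1, 2 - 1j, 0, -3, 0.5j]
a = fourier_coefficients(F)
inversion_error = max(abs(x - y) for x, y in zip(reconstruct(a), F))
plancherel_gap = abs(sum(abs(c) ** 2 for c in a) - sum(abs(v) ** 2 for v in F) / len(F))
```

Both `inversion_error` and `plancherel_gap` should be of the order of machine precision. Note that this direct computation uses on the order of $N^2$ operations.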
1.3 The fast Fourier transform

The fast Fourier transform is a method that was developed as a means 
of calculating efficiently the Fourier coefficients of a function F on Z(N). 

The problem, which arises naturally in numerical analysis, is to determine
an algorithm that minimizes the amount of time it takes a computer
to calculate the Fourier coefficients of a given function on $\mathbb{Z}(N)$. Since
this amount of time is roughly proportional to the number of operations
the computer must perform, our problem becomes that of minimizing
the number of operations necessary to obtain all the Fourier coefficients
$\{a_n\}$ given the values of $F$ on $\mathbb{Z}(N)$. By operations we mean either an
addition or a multiplication of complex numbers.

We begin with a naive approach to the problem. Fix $N$, and suppose
that we are given $F(0), \ldots, F(N-1)$ and $\omega_N = e^{-2\pi i/N}$. If we denote
by $a_k^N(F)$ the $k$th Fourier coefficient of $F$ on $\mathbb{Z}(N)$, then by definition

$$a_k^N(F) = \frac{1}{N} \sum_{r=0}^{N-1} F(r) \omega_N^{kr},$$

and crude estimates show that the number of operations needed to calculate
all Fourier coefficients is $\le 2N^2 + N$. Indeed, it takes at most
$N - 2$ multiplications to determine $\omega_N^2, \ldots, \omega_N^{N-1}$, and each coefficient
$a_k^N$ requires $N + 1$ multiplications and $N - 1$ additions.

We now present the fast Fourier transform, an algorithm that improves
the bound $O(N^2)$ obtained above. Such an improvement is possible
if, for example, we restrict ourselves to the case where the partition
of the circle is dyadic, that is, $N = 2^n$. (See also Exercise 9.)

Theorem 1.3 Given $\omega_N = e^{-2\pi i/N}$ with $N = 2^n$, it is possible to calculate
the Fourier coefficients of a function on $\mathbb{Z}(N)$ with at most

$$4 \cdot 2^n n = 4N \log_2(N) = O(N \log N)$$

operations.

The proof of the theorem consists of using the calculations for $M$
division points, to obtain the Fourier coefficients for $2M$ division points.
Since we choose $N = 2^n$, we obtain the desired formula as a consequence
of a recurrence which involves $n = O(\log N)$ steps.

Let $\#(M)$ denote the minimum number of operations needed to calculate
all the Fourier coefficients of any function on $\mathbb{Z}(M)$. The key to
the proof of the theorem is contained in the following recursion step.

Lemma 1.4 If we are given $\omega_{2M} = e^{-2\pi i/(2M)}$, then

$$\#(2M) \le 2\#(M) + 8M.$$

Proof. The calculation of $\omega_{2M}, \ldots, \omega_{2M}^{2M}$ requires no more than $2M$
operations. Note that in particular we get $\omega_M = e^{-2\pi i/M} = \omega_{2M}^2$. The
main idea is that for any given function $F$ on $\mathbb{Z}(2M)$, we consider two
functions $F_0$ and $F_1$ on $\mathbb{Z}(M)$ defined by

$$F_0(r) = F(2r) \quad \text{and} \quad F_1(r) = F(2r + 1).$$

We assume that it is possible to calculate the Fourier coefficients of $F_0$
and $F_1$ in no more than $\#(M)$ operations each. If we denote the Fourier
coefficients corresponding to the groups $\mathbb{Z}(2M)$ and $\mathbb{Z}(M)$ by $a_k^{2M}$ and
$a_k^M$, respectively, then we have

$$a_k^{2M}(F) = \frac{1}{2} \left( a_k^M(F_0) + a_k^M(F_1) \omega_{2M}^k \right).$$

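To prove this, we sum over odd and even integers in the definition of the
Fourier coefficient $a_k^{2M}(F)$, and find

$$a_k^{2M}(F) = \frac{1}{2M} \sum_{r=0}^{2M-1} F(r) \omega_{2M}^{kr}$$

$$= \frac{1}{2} \left( \frac{1}{M} \sum_{\ell=0}^{M-1} F(2\ell) \omega_{2M}^{k(2\ell)} + \frac{1}{M} \sum_{m=0}^{M-1} F(2m+1) \omega_{2M}^{k(2m+1)} \right)$$

$$= \frac{1}{2} \left( \frac{1}{M} \sum_{\ell=0}^{M-1} F_0(\ell) \omega_M^{k\ell} + \frac{1}{M} \sum_{m=0}^{M-1} F_1(m) \omega_M^{km} \omega_{2M}^{k} \right),$$

which establishes our assertion.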

As a result, knowing $a_k^M(F_0)$, $a_k^M(F_1)$, and $\omega_{2M}^k$, we see that each
$a_k^{2M}(F)$ can be computed using no more than three operations (one addition
and two multiplications). So

$$\#(2M) \le 2M + 2\#(M) + 3 \times 2M = 2\#(M) + 8M,$$

and the proof of the lemma is complete.

An induction on $n$, where $N = 2^n$, will conclude the proof of the theorem.
The initial step $n = 1$ is easy, since $N = 2$ and the two Fourier
coefficients are

$$a_0^N(F) = \frac{1}{2} \left( F(1) + F(-1) \right) \quad \text{and} \quad a_1^N(F) = \frac{1}{2} \left( F(1) + (-1) F(-1) \right).$$

Calculating these Fourier coefficients requires no more than five operations,
which is less than $4 \times 2 = 8$. Suppose the theorem is true up to
$N = 2^{n-1}$ so that $\#(N) \le 4 \cdot 2^{n-1}(n - 1)$. By the lemma we must have

$$\#(2N) \le 2 \cdot 4 \cdot 2^{n-1}(n - 1) + 8 \cdot 2^{n-1} = 4 \cdot 2^n n,$$

which concludes the inductive step and the proof of the theorem.
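The recursion step of Lemma 1.4 translates almost literally into a recursive program. The following Python sketch is one possible rendering, not a tuned implementation: the names `naive_coefficients`, `fft_coefficients`, and `recursion_bound` are ours, and the code assumes that the number of sample values is a power of 2. It keeps the normalization $a_k^N(F) = \frac{1}{N}\sum_r F(r)\omega_N^{kr}$ of the text and uses the fact that $a_k^M$ is periodic in $k$ with period $M$.

```python
import cmath


def naive_coefficients(F):
    """Direct evaluation of a_k^N(F) = (1/N) sum_r F(r) w_N^{kr}."""
    N = len(F)
    w = cmath.exp(-2j * cmath.pi / N)
    return [sum(F[r] * w ** (k * r) for r in range(N)) / N for k in range(N)]


def fft_coefficients(F):
    """Same coefficients via a_k^{2M}(F) = (a_k^M(F_0) + a_k^M(F_1) w_{2M}^k) / 2."""
    N = len(F)
    if N == 0:
        raise ValueError("need at least one sample value")
    if N == 1:
        return [complex(F[0])]
    if N % 2:
        raise ValueError("this sketch assumes the length is a power of 2")
    M = N // 2
    a0 = fft_coefficients(F[0::2])  # coefficients of F_0(r) = F(2r)
    a1 = fft_coefficients(F[1::2])  # coefficients of F_1(r) = F(2r + 1)
    w = cmath.exp(-2j * cmath.pi / N)  # omega_{2M}
    # a_k^M depends only on k modulo M, so we index the halves by k % M.
    return [(a0[k % M] + a1[k % M] * w ** k) / 2 for k in range(N)]


def recursion_bound(n: int) -> int:
    """Unroll B(2M) = 2 B(M) + 8 M starting from B(2) = 8, for N = 2^n, n >= 1."""
    bound, size = 8, 2
    for _ in range(n - 1):
        bound, size = 2 * bound + 8 * size, 2 * size
    return bound
```

On any list of $2^n$ values the two coefficient functions should agree up to rounding error, and `recursion_bound(n)` reproduces the bound $4 \cdot 2^n n$ of Theorem 1.3.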

2 Fourier analysis on finite abelian groups 

The main goal in the rest of this chapter is to generalize the results about
Fourier series expansions obtained in the special case of $\mathbb{Z}(N)$.

After a brief introduction to some notions related to finite abelian
groups, we turn to the important concept of a character. In our setting,
we find that characters play the same role as the exponentials
$e_0, \ldots, e_{N-1}$ on the group $\mathbb{Z}(N)$, and thus provide the key ingredient
in the development of the theory on arbitrary finite abelian groups. In
fact, it suffices to prove that a finite abelian group has “enough” characters,
and this leads automatically to the desired Fourier theory.

2.1 Abelian groups 

An abelian group (or commutative group) is a set $G$ together with a
binary operation on pairs of elements of $G$, $(a, b) \mapsto a \cdot b$, that satisfies
the following conditions:

(i) Associativity: $a \cdot (b \cdot c) = (a \cdot b) \cdot c$ for all $a, b, c \in G$.

(ii) Identity: There exists an element $u \in G$ (often written as either $1$
or $0$) such that $a \cdot u = u \cdot a = a$ for all $a \in G$.

(iii) Inverses: For every $a \in G$, there exists an element $a^{-1} \in G$ such
that $a \cdot a^{-1} = a^{-1} \cdot a = u$.

(iv) Commutativity: For $a, b \in G$, we have $a \cdot b = b \cdot a$.

We leave as simple verifications the facts that the identity element and 
inverses are unique. 

Warning. In the definition of an abelian group, we used the “multiplicative”
notation for the operation in $G$. Sometimes, one uses the “additive”
notation $a + b$ and $-a$, instead of $a \cdot b$ and $a^{-1}$. There are times
when one notation may be more appropriate than the other, and the
examples below illustrate this point. The same group may have different
interpretations, one where the multiplicative notation is more suggestive,
and another where it is natural to view the group with addition as the
operation.
Examples of abelian groups

• The set of real numbers $\mathbb{R}$ with the usual addition. The identity is
$0$ and the inverse of $x$ is $-x$.

Also, $\mathbb{R} - \{0\}$ and $\mathbb{R}^+ = \{x \in \mathbb{R} : x > 0\}$, equipped with the standard
multiplication, are abelian groups. In both cases the unit is $1$
and the inverse of $x$ is $1/x$.

• With the usual addition, the set of integers $\mathbb{Z}$ is an abelian group.
However, $\mathbb{Z} - \{0\}$ is not an abelian group with the standard multiplication,
since, for example, $2$ does not have a multiplicative
inverse in $\mathbb{Z}$. In contrast, $\mathbb{Q} - \{0\}$ is an abelian group with the
standard multiplication.

• The unit circle $S^1$ in the complex plane. If we view the circle as
the set of points $\{e^{i\theta} : \theta \in \mathbb{R}\}$, the group operation is the standard
multiplication of complex numbers. However, if we identify points
on $S^1$ with their angle $\theta$, then $S^1$ becomes $\mathbb{R}$ modulo $2\pi$, where the
operation is addition modulo $2\pi$.

• $\mathbb{Z}(N)$ is an abelian group. Viewed as the $N$th roots of unity on the
circle, $\mathbb{Z}(N)$ is a group under multiplication of complex numbers.
However, if $\mathbb{Z}(N)$ is interpreted as $\mathbb{Z}/N\mathbb{Z}$, the integers modulo $N$,
then it is an abelian group where the operation is addition modulo
$N$.

• The last example consists of $\mathbb{Z}^*(q)$. This group is defined as the set
of all integers modulo $q$ that have multiplicative inverses, with the
group operation being multiplication modulo $q$. This important
example is discussed in more detail below.

A homomorphism between two abelian groups $G$ and $H$ is a map
$f : G \to H$ which satisfies the property

$$f(a \cdot b) = f(a) \cdot f(b),$$

where the dot on the left-hand side is the operation in $G$, and the dot
on the right-hand side the operation in $H$.

We say that two groups $G$ and $H$ are isomorphic, and write $G \approx H$,
if there is a bijective homomorphism from $G$ to $H$. Equivalently, $G$ and
$H$ are isomorphic if there exists another homomorphism $\tilde{f} : H \to G$, so
that for all $a \in G$ and $b \in H$

$$(\tilde{f} \circ f)(a) = a \quad \text{and} \quad (f \circ \tilde{f})(b) = b.$$

Roughly speaking, isomorphic groups describe the “same” object because
they have the same underlying group structure (which is really all that
matters); however, their particular notational representations might be
different.

Example 1. A pair of isomorphic abelian groups arose already when
we considered the group $\mathbb{Z}(N)$. In one representation it was given as
the multiplicative group of $N$th roots of unity in $\mathbb{C}$. In a second representation
it was the additive group $\mathbb{Z}/N\mathbb{Z}$ of residue classes of integers
modulo $N$. The mapping $z \mapsto R(n)$, which associates to a root of unity
$z = e^{2\pi i n/N} = \zeta^n$ the residue class in $\mathbb{Z}/N\mathbb{Z}$ determined by $n$, provides
an isomorphism between the two different representations.

Example 2. In parallel with the previous example, we see that the circle
(with multiplication) is isomorphic to the real numbers modulo $2\pi$ (with
addition).

Example 3. The properties of the exponential and logarithm guarantee
that

$$\exp : \mathbb{R} \to \mathbb{R}^+ \quad \text{and} \quad \log : \mathbb{R}^+ \to \mathbb{R}$$

are two homomorphisms that are inverses of each other. Thus $\mathbb{R}$ (with
addition) and $\mathbb{R}^+$ (with multiplication) are isomorphic.

In what follows, we are primarily interested in abelian groups that are
finite. In this case, we denote by $|G|$ the number of elements in $G$, and
call $|G|$ the order of the group. For example, the order of $\mathbb{Z}(N)$ is $N$.

A few additional remarks are in order: 

• If $G_1$ and $G_2$ are two finite abelian groups, their direct product
$G_1 \times G_2$ is the group whose elements are pairs $(g_1, g_2)$ with $g_1 \in G_1$
and $g_2 \in G_2$. The operation in $G_1 \times G_2$ is then defined by

$$(g_1, g_2) \cdot (g_1', g_2') = (g_1 \cdot g_1', g_2 \cdot g_2').$$

Clearly, if $G_1$ and $G_2$ are finite abelian groups, then so is $G_1 \times G_2$.
The definition of direct product generalizes immediately to the case
of finitely many factors $G_1 \times G_2 \times \cdots \times G_n$.

• The structure theorem for finite abelian groups states that such a
group is isomorphic to a direct product of groups of the type $\mathbb{Z}(N)$;
see Problem 2. This is a nice result which gives us an overview of
the class of all finite abelian groups. However, since we shall not
use this theorem below, we omit its proof.
We now discuss briefly the examples of abelian groups that play a
central role in the proof of Dirichlet's theorem in the next chapter.

The group $\mathbb{Z}^*(q)$

Let $q$ be a positive integer. We see that multiplication in $\mathbb{Z}(q)$ can be
unambiguously defined, because if $n$ is congruent to $n'$ and $m$ is congruent
to $m'$ (both modulo $q$), then $nm$ is congruent to $n'm'$ modulo $q$. An
integer $n \in \mathbb{Z}(q)$ is a unit if there exists an integer $m \in \mathbb{Z}(q)$ so that

$$nm \equiv 1 \mod q.$$

The set of all units in $\mathbb{Z}(q)$ is denoted by $\mathbb{Z}^*(q)$, and it is clear from our
definition that $\mathbb{Z}^*(q)$ is an abelian group under multiplication modulo $q$.
Thus within the additive group $\mathbb{Z}(q)$ lies a set $\mathbb{Z}^*(q)$ that is a group under
multiplication. An alternative characterization of $\mathbb{Z}^*(q)$ will be given in
the next chapter, as those elements in $\mathbb{Z}(q)$ that are relatively prime to $q$.
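The definition of a unit can be tested by brute force for small moduli. The following Python sketch (with our own function names `units` and `is_closed_under_multiplication`, and the assumption $q \ge 2$) lists the units of $\mathbb{Z}(q)$ directly from the definition and checks that they are closed under multiplication modulo $q$.

```python
from math import gcd


def units(q: int):
    """Residues n in Z(q) for which some m satisfies n*m = 1 mod q (q >= 2)."""
    return [n for n in range(q) if any((n * m) % q == 1 for m in range(q))]


def is_closed_under_multiplication(q: int) -> bool:
    """Check that the product of two units modulo q is again a unit."""
    U = set(units(q))
    return all((a * b) % q in U for a in U for b in U)


# Compare with the characterization by relative primality, for small q.
agrees_with_gcd = all(
    units(q) == [n for n in range(q) if gcd(n, q) == 1] for q in range(2, 40)
)
```

The comparison `agrees_with_gcd` is only a numerical check for $2 \le q < 40$ of the characterization announced above; it does not replace the proof given in the next chapter.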

Example 4. The group of units in $\mathbb{Z}(4) = \{0, 1, 2, 3\}$ is

$$\mathbb{Z}^*(4) = \{1, 3\}.$$

This reflects the fact that odd integers are divided into two classes depending
on whether they are of the form $4k + 1$ or $4k + 3$. In fact, $\mathbb{Z}^*(4)$
is isomorphic to $\mathbb{Z}(2)$. Indeed, we can make the following association:

$\mathbb{Z}^*(4)$     $\mathbb{Z}(2)$

1  $\longleftrightarrow$  0

3  $\longleftrightarrow$  1

and then notice that multiplication in $\mathbb{Z}^*(4)$ corresponds to addition in
$\mathbb{Z}(2)$.

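Example 5. The units in $\mathbb{Z}(5)$ are

$$\mathbb{Z}^*(5) = \{1, 2, 3, 4\}.$$

Moreover, $\mathbb{Z}^*(5)$ is isomorphic to $\mathbb{Z}(4)$ with the following identification:

$\mathbb{Z}^*(5)$     $\mathbb{Z}(4)$

1  $\longleftrightarrow$  0

2  $\longleftrightarrow$  1

3  $\longleftrightarrow$  3

4  $\longleftrightarrow$  2

Example 6. The units in $\mathbb{Z}(8) = \{0, 1, 2, 3, 4, 5, 6, 7\}$ are given by

$$\mathbb{Z}^*(8) = \{1, 3, 5, 7\}.$$

In fact, $\mathbb{Z}^*(8)$ is isomorphic to the direct product $\mathbb{Z}(2) \times \mathbb{Z}(2)$. In this
case, an isomorphism between the groups is given by the identification

$\mathbb{Z}^*(8)$     $\mathbb{Z}(2) \times \mathbb{Z}(2)$

1  $\longleftrightarrow$  (0, 0)

3  $\longleftrightarrow$  (1, 0)

5  $\longleftrightarrow$  (0, 1)

7  $\longleftrightarrow$  (1, 1)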

2.2 Characters 

Let $G$ be a finite abelian group (with the multiplicative notation) and
$S^1$ the unit circle in the complex plane. A character on $G$ is a complex-valued
function $e : G \to S^1$ which satisfies the following condition:

$$(2) \qquad e(a \cdot b) = e(a) e(b) \quad \text{for all } a, b \in G.$$

In other words, a character is a homomorphism from $G$ to the circle
group. The trivial or unit character is defined by $e(a) = 1$ for all
$a \in G$.

Characters play an important role in the context of finite Fourier series,
primarily because the multiplicative property (2) generalizes the
analogous identity for the exponential functions on the circle and the
law

$$e_\ell(k + m) = e_\ell(k) e_\ell(m),$$

which held for the exponentials $e_0, \ldots, e_{N-1}$ used in the Fourier theory
on $\mathbb{Z}(N)$. There we had $e_\ell(k) = \zeta^{\ell k} = e^{2\pi i \ell k/N}$, with $0 \le \ell \le N - 1$ and
$k \in \mathbb{Z}(N)$, and in fact, the functions $e_0, \ldots, e_{N-1}$ are precisely all the
characters of the group $\mathbb{Z}(N)$.

If $G$ is a finite abelian group, we denote by $\hat{G}$ the set of all characters
of $G$, and observe next that this set inherits the structure of an abelian
group.




Lemma 2.1 The set $\hat{G}$ is an abelian group under multiplication defined
by

$$(e_1 \cdot e_2)(a) = e_1(a) e_2(a) \quad \text{for all } a \in G.$$

The proof of this assertion is straightforward if one observes that the
trivial character plays the role of the unit. We call $\hat{G}$ the dual group
of $G$.

In light of the above analogy between characters for a general abelian
group and the exponentials on $\mathbb{Z}(N)$, we gather several more examples
of groups and their duals. This provides further evidence of the central
role played by characters. (See Exercises 4, 5, and 6.)

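Example 1. If $G = \mathbb{Z}(N)$, all characters of $G$ take the form
$e_\ell(k) = \zeta^{\ell k} = e^{2\pi i \ell k/N}$ for some $0 \le \ell \le N - 1$, and it is easy to check
that $e_\ell \mapsto \ell$ gives an isomorphism from $\widehat{\mathbb{Z}(N)}$ to $\mathbb{Z}(N)$.

Example 2. The dual group of the circle$^1$ is precisely $\{e_n\}_{n \in \mathbb{Z}}$ (where
$e_n(x) = e^{2\pi i n x}$). Moreover, $e_n \mapsto n$ gives an isomorphism between $\widehat{S^1}$
and the integers $\mathbb{Z}$.

Example 3. Characters on $\mathbb{R}$ are described by

$$e_\xi(x) = e^{2\pi i \xi x} \quad \text{where } \xi \in \mathbb{R}.$$

Thus $e_\xi \mapsto \xi$ is an isomorphism from $\hat{\mathbb{R}}$ to $\mathbb{R}$.

Example 4. Since $\exp : \mathbb{R} \to \mathbb{R}^+$ is an isomorphism, we deduce from the
previous example that the characters on $\mathbb{R}^+$ are given by

$$e_\xi(x) = x^{2\pi i \xi} = e^{2\pi i \xi \log x} \quad \text{where } \xi \in \mathbb{R},$$

and $\widehat{\mathbb{R}^+}$ is isomorphic to $\mathbb{R}$ (or $\mathbb{R}^+$).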

The following lemma says that a nowhere vanishing multiplicative
function is a character, a result that will be useful later.

Lemma 2.2 Let $G$ be a finite abelian group, and $e : G \to \mathbb{C} - \{0\}$ a multiplicative
function, namely $e(a \cdot b) = e(a) e(b)$ for all $a, b \in G$. Then $e$ is
a character.


$^1$ In addition to (2), the definition of a character on an infinite abelian group requires
continuity. When $G$ is the circle, $\mathbb{R}$, or $\mathbb{R}^+$, the meaning of “continuous” refers to the
standard notion of limit.

Proof. The group $G$ being finite, the absolute value of $e(a)$ is bounded
above and below as $a$ ranges over $G$. Since $|e(b^n)| = |e(b)|^n$, we conclude
that $|e(b)| = 1$ for all $b \in G$.

The next step is to verify that the characters form an orthonormal
basis of the vector space $V$ of functions over the group $G$. This fact
was obtained directly in the special case $G = \mathbb{Z}(N)$ from the explicit
description of the characters $e_0, \ldots, e_{N-1}$.

In the general case, we begin with the orthogonality relations; then we
prove that there are “enough” characters by showing that there are as
many as the order of the group.

2.3 The orthogonality relations 

Let $V$ denote the vector space of complex-valued functions defined on the
finite abelian group $G$. Note that the dimension of $V$ is $|G|$, the order of
$G$. We define a Hermitian inner product on $V$ by

$$(3) \qquad (f, g) = \frac{1}{|G|} \sum_{a \in G} f(a) \overline{g(a)}, \quad \text{whenever } f, g \in V.$$

Here the sum is taken over the group and is therefore finite.

Theorem 2.3 The characters of $G$ form an orthonormal family with
respect to the inner product defined above.

Since $|e(a)| = 1$ for any character, we find that

$$(e, e) = \frac{1}{|G|} \sum_{a \in G} e(a) \overline{e(a)} = \frac{1}{|G|} \sum_{a \in G} |e(a)|^2 = 1.$$

If $e \ne e'$ and both are characters, we must prove that $(e, e') = 0$; we
isolate the key step in a lemma.

Lemma 2.4 If $e$ is a non-trivial character of the group $G$, then
$\sum_{a \in G} e(a) = 0$.

Proof. Choose $b \in G$ such that $e(b) \ne 1$. Then we have

$$e(b) \sum_{a \in G} e(a) = \sum_{a \in G} e(b) e(a) = \sum_{a \in G} e(ab) = \sum_{a \in G} e(a).$$

The last equality follows because as $a$ ranges over the group, $ab$ ranges
over $G$ as well. Therefore $\sum_{a \in G} e(a) = 0$.

We can now conclude the proof of the theorem. Suppose $e'$ is a character
distinct from $e$. Because $e \cdot (e')^{-1}$ is non-trivial, the lemma implies
that

$$\sum_{a \in G} e(a) (e'(a))^{-1} = 0.$$

Since $(e'(a))^{-1} = \overline{e'(a)}$, the theorem is proved.

As a consequence of the theorem, we see that distinct characters are
linearly independent. Since the dimension of $V$ over $\mathbb{C}$ is $|G|$, we conclude
that the order of $\hat{G}$ is finite and $\le |G|$. The main result to which we now
turn is that, in fact, $|\hat{G}| = |G|$.
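The orthogonality relations can be observed numerically on a small product group. The sketch below works on $G = \mathbb{Z}(N_1) \times \mathbb{Z}(N_2)$ in additive notation and assumes, as an illustration rather than something proved at this point of the text, that the functions $(k_1, k_2) \mapsto e^{2\pi i (\ell_1 k_1/N_1 + \ell_2 k_2/N_2)}$ are characters of $G$ (compare Exercise 4 and Problem 3). The helper names `elements`, `character`, and `inner` are ours.

```python
import cmath
from itertools import product


def elements(N1: int, N2: int):
    """All pairs (k1, k2) of the direct product Z(N1) x Z(N2)."""
    return list(product(range(N1), range(N2)))


def character(l1: int, l2: int, N1: int, N2: int):
    """The function (k1, k2) -> exp(2 pi i (l1 k1 / N1 + l2 k2 / N2))."""
    def e(a):
        k1, k2 = a
        return cmath.exp(2j * cmath.pi * (l1 * k1 / N1 + l2 * k2 / N2))
    return e


def inner(f, g, G) -> complex:
    """Inner product (3): (f, g) = (1/|G|) sum_a f(a) conj(g(a))."""
    return sum(f(a) * g(a).conjugate() for a in G) / len(G)


G = elements(2, 3)
chars = [character(l1, l2, 2, 3) for l1 in range(2) for l2 in range(3)]
gram = [[inner(e1, e2, G) for e2 in chars] for e1 in chars]
```

Here `gram` should be the 6 by 6 identity matrix up to rounding error, and summing any non-trivial character over `G` should give $0$, in line with Lemma 2.4.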

2.4 Characters as a total family 

The following completes the analogy between characters and the complex
exponentials.

Theorem 2.5 The characters of a finite abelian group $G$ form a basis
for the vector space of functions on $G$.

There are several proofs of this theorem. One consists of using the
structure theorem for finite abelian groups we have mentioned earlier,
which states that any such group is the direct product of cyclic groups,
that is, groups of the type $\mathbb{Z}(N)$. Since cyclic groups are self-dual, using
this fact we would conclude that $|\hat{G}| = |G|$, and therefore the characters
form a basis for the vector space of functions on $G$. (See Problem 3.)

Here we shall prove the theorem directly without these considerations.

Suppose $V$ is a vector space of dimension $d$ with inner product $(\cdot, \cdot)$.
A linear transformation $T : V \to V$ is unitary if it preserves the inner
product, $(Tv, Tw) = (v, w)$ for all $v, w \in V$. The spectral theorem from
linear algebra asserts that any unitary transformation is diagonalizable.
In other words, there exists a basis $\{v_1, \ldots, v_d\}$ (eigenvectors) of $V$ such
that $T(v_i) = \lambda_i v_i$, where $\lambda_i \in \mathbb{C}$ is the eigenvalue attached to $v_i$.

The proof of Theorem 2.5 is based on the following extension of the
spectral theorem.

Lemma 2.6 Suppose $\{T_1, \ldots, T_k\}$ is a commuting family of unitary transformations
on the finite-dimensional inner product space $V$; that is,

$$T_i T_j = T_j T_i \quad \text{for all } i, j.$$

Then $T_1, \ldots, T_k$ are simultaneously diagonalizable. In other words, there
exists a basis for $V$ which consists of eigenvectors for every
$T_i$, $i = 1, \ldots, k$.
Proof. We use induction on $k$. The case $k = 1$ is simply the spectral
theorem. Suppose that the lemma is true for any family of $k - 1$
commuting unitary transformations. The spectral theorem applied to $T_k$
says that $V$ is the direct sum of its eigenspaces

$$V = V_{\lambda_1} \oplus \cdots \oplus V_{\lambda_s},$$

where $V_{\lambda_i}$ denotes the subspace of all eigenvectors with eigenvalue $\lambda_i$.
We claim that each one of the $T_1, \ldots, T_{k-1}$ maps each eigenspace $V_{\lambda_i}$ to
itself. Indeed, if $v \in V_{\lambda_i}$ and $1 \le j \le k - 1$, then

$$T_k T_j(v) = T_j T_k(v) = T_j(\lambda_i v) = \lambda_i T_j(v)$$

so $T_j(v) \in V_{\lambda_i}$, and the claim is proved.

Since the restrictions to $V_{\lambda_i}$ of $T_1, \ldots, T_{k-1}$ form a family of commuting
unitary linear transformations, the induction hypothesis guarantees
that these are simultaneously diagonalizable on each subspace $V_{\lambda_i}$. This
diagonalization provides us with the desired basis for each $V_{\lambda_i}$, and thus
for $V$.

We can now prove Theorem 2.5. Recall that the vector space $V$ of
complex-valued functions defined on $G$ has dimension $|G|$. For each
$a \in G$ we define a linear transformation $T_a : V \to V$ by

$$(T_a f)(x) = f(a \cdot x) \quad \text{for } x \in G.$$

Since $G$ is abelian it is clear that $T_a T_b = T_b T_a$ for all $a, b \in G$, and one
checks easily that $T_a$ is unitary for the Hermitian inner product (3) defined
on $V$. By Lemma 2.6 the family $\{T_a\}_{a \in G}$ is simultaneously diagonalizable.
This means there is a basis $\{v_b(x)\}_{b \in G}$ for $V$ such that
each $v_b(x)$ is an eigenfunction for $T_a$, for every $a$. Let $v$ be one of these
basis elements and $1$ the unit element in $G$. We must have $v(1) \ne 0$ for
otherwise

$$v(a) = v(a \cdot 1) = (T_a v)(1) = \lambda_a v(1) = 0,$$

where $\lambda_a$ is the eigenvalue of $v$ for $T_a$. Hence $v = 0$, and this is a contradiction.
We claim that the function defined by $w(x) = \lambda_x = v(x)/v(1)$
is a character of $G$. Arguing as above we find that $w(x) \ne 0$ for every $x$,
and

$$w(a \cdot b) = \frac{v(a \cdot b)}{v(1)} = \frac{\lambda_a v(b)}{v(1)} = \lambda_a \lambda_b \frac{v(1)}{v(1)} = \lambda_a \lambda_b = w(a) w(b).$$

We now invoke Lemma 2.2 to conclude the proof.
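The mechanism of this proof can be watched in the familiar case $G = \mathbb{Z}(N)$, written additively so that $(T_a f)(x) = f(a + x)$ and the unit element is $0$. In the Python sketch below (helper names `translate`, `character`, `is_eigenfunction`, `normalize` are ours), each exponential $e_\ell$ is an eigenfunction of every $T_a$ with eigenvalue $e_\ell(a)$, and dividing a common eigenfunction $v$ by its value at the unit element recovers a character, as in the definition of $w$ above.

```python
import cmath


def translate(a: int, f, N: int):
    """(T_a f)(x) = f(a + x), with f stored as the list [f(0), ..., f(N-1)]."""
    return [f[(a + x) % N] for x in range(N)]


def character(ell: int, N: int):
    """Values of e_ell(x) = e^{2 pi i ell x / N} for x = 0, ..., N-1."""
    return [cmath.exp(2j * cmath.pi * ell * x / N) for x in range(N)]


def is_eigenfunction(ell: int, a: int, N: int, tol: float = 1e-12) -> bool:
    """Check T_a e_ell = e_ell(a) * e_ell, so the eigenvalue is e_ell(a)."""
    e = character(ell, N)
    lhs = translate(a, e, N)
    return all(abs(lhs[x] - e[a % N] * e[x]) < tol for x in range(N))


def normalize(v):
    """w(x) = v(x) / v(0): divide by the value at the unit element 0."""
    return [x / v[0] for x in v]
```

For example, with `N = 6`, `normalize` applied to the eigenfunction `[(2 - 3j) * z for z in character(4, 6)]` returns the values of `character(4, 6)` up to rounding error.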
2.5 Fourier inversion and Plancherel formula

We now put together the results obtained in the previous sections to
discuss the Fourier expansion of a function on a finite abelian group $G$.
Given a function $f$ on $G$ and character $e$ of $G$, we define the Fourier
coefficient of $f$ with respect to $e$, by

$$\hat{f}(e) = (f, e) = \frac{1}{|G|} \sum_{a \in G} f(a) \overline{e(a)},$$

and the Fourier series of $f$ as

$$f \sim \sum_{e \in \hat{G}} \hat{f}(e) e.$$

Since the characters form a basis, we know that

$$f = \sum_{e \in \hat{G}} c_e e$$

for some set of constants $c_e$. By the orthogonality relations satisfied by
the characters, we find that

$$(f, e) = c_e,$$

so $f$ is indeed equal to its Fourier series, namely,

$$f = \sum_{e \in \hat{G}} \hat{f}(e) e.$$


We summarize our results. 

Theorem 2.7 Let $G$ be a finite abelian group. The characters of $G$ form
an orthonormal basis for the vector space $V$ of functions on $G$ equipped
with the inner product

$$(f, g) = \frac{1}{|G|} \sum_{a \in G} f(a) \overline{g(a)}.$$

In particular, any function $f$ on $G$ is equal to its Fourier series

$$f = \sum_{e \in \hat{G}} \hat{f}(e) e.$$

Finally, we have the Parseval-Plancherel formula for finite abelian 
groups. 
Theorem 2.8 If $f$ is a function on $G$, then $\|f\|^2 = \sum_{e \in \hat{G}} |\hat{f}(e)|^2$.

Proof. Since the characters of $G$ form an orthonormal basis for the
vector space $V$, and $(f, e) = \hat{f}(e)$, we have that

$$\|f\|^2 = (f, f) = \sum_{e \in \hat{G}} (f, e) \overline{\hat{f}(e)} = \sum_{e \in \hat{G}} |\hat{f}(e)|^2.$$

The apparent difference of this statement with that of Theorem 1.2 
is due to the different normalizations of the Fourier coefficients that are 
used. 
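To see Theorems 2.7 and 2.8 at work on a group written multiplicatively, one can take $G = \mathbb{Z}^*(5)$. The sketch below is an illustration under stated assumptions: it uses the fact that every unit modulo 5 is a power of 2 (namely $2^0 = 1$, $2^1 = 2$, $2^2 = 4$, $2^3 = 3$), and it takes as characters the four functions $2^j \mapsto e^{2\pi i \ell j/4}$, $\ell = 0, 1, 2, 3$. The names `LOG`, `character`, `fourier_coefficient`, and `fourier_series` are ours.

```python
import cmath

Q = 5
GROUP = [1, 2, 3, 4]
# Exponent j with 2^j = a mod 5, for each unit a: {1: 0, 2: 1, 4: 2, 3: 3}.
LOG = {pow(2, j, Q): j for j in range(4)}


def character(ell: int):
    """The function a = 2^j -> exp(2 pi i ell j / 4), stored as a dict on GROUP."""
    return {a: cmath.exp(2j * cmath.pi * ell * LOG[a] / 4) for a in GROUP}


CHARACTERS = [character(ell) for ell in range(4)]


def fourier_coefficient(f, e) -> complex:
    """f^(e) = (f, e) = (1/|G|) sum_a f(a) conj(e(a))."""
    return sum(f[a] * e[a].conjugate() for a in GROUP) / len(GROUP)


def fourier_series(f):
    """Evaluate sum_e f^(e) e(a) at every a in G."""
    coeffs = [fourier_coefficient(f, e) for e in CHARACTERS]
    return {a: sum(c * e[a] for c, e in zip(coeffs, CHARACTERS)) for a in GROUP}


f = {1: 3.0, 2: -1j, 3: 2 + 2j, 4: 0.5}
norm_squared = sum(abs(f[a]) ** 2 for a in GROUP) / len(GROUP)
coefficient_sum = sum(abs(fourier_coefficient(f, e)) ** 2 for e in CHARACTERS)
```

The dictionary `fourier_series(f)` should agree with `f` at every point, and `norm_squared` should equal `coefficient_sum`, both up to rounding error. Note the factor $1/|G|$ in the norm, which is the normalization of inner product (3).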

3 Exercises 

1. Let $f$ be a function on the circle. For each $N \ge 1$ the discrete Fourier
coefficients of $f$ are defined by

$$a_N(n) = \frac{1}{N} \sum_{k=1}^{N} f(e^{2\pi i k/N}) e^{-2\pi i k n/N}, \quad \text{for } n \in \mathbb{Z}.$$

We also let

$$a(n) = \int_0^1 f(e^{2\pi i x}) e^{-2\pi i n x} \, dx$$

denote the ordinary Fourier coefficients of $f$.

(a) Show that $a_N(n) = a_N(n + N)$.

(b) Prove that if $f$ is continuous, then $a_N(n) \to a(n)$ as $N \to \infty$.


2. If $f$ is a $C^1$ function on the circle, prove that $|a_N(n)| \le c/|n|$ whenever
$0 < |n| \le N/2$.

[Hint: Write

$$a_N(n) [1 - e^{2\pi i \ell n/N}] = \frac{1}{N} \sum_{k=1}^{N} [f(e^{2\pi i k/N}) - f(e^{2\pi i (k+\ell)/N})] e^{-2\pi i k n/N},$$

and choose $\ell$ so that $\ell n/N$ is nearly $1/2$.]

3. By a similar method, show that if $f$ is a $C^2$ function on the circle, then

$$|a_N(n)| \le c/|n|^2, \quad \text{whenever } 0 < |n| \le N/2.$$

As a result, prove the inversion formula for $f \in C^2$,

$$f(e^{2\pi i x}) = \sum_{n=-\infty}^{\infty} a(n) e^{2\pi i n x}$$

from its finite version.

[Hint: For the first part, use the second symmetric difference

$$f(e^{2\pi i (k+\ell)/N}) + f(e^{2\pi i (k-\ell)/N}) - 2 f(e^{2\pi i k/N}).$$

For the second part, if $N$ is odd (say), write the inversion formula as

$$f(e^{2\pi i k/N}) = \sum_{|n| < N/2} a_N(n) e^{2\pi i k n/N}.]$$

4. Let $e$ be a character on $G = \mathbb{Z}(N)$, the additive group of integers modulo $N$.
Show that there exists a unique $0 \le \ell \le N - 1$ so that

$$e(k) = e_\ell(k) = e^{2\pi i \ell k/N} \quad \text{for all } k \in \mathbb{Z}(N).$$

Conversely, every function of this type is a character on $\mathbb{Z}(N)$. Deduce that
$e_\ell \mapsto \ell$ defines an isomorphism from $\hat{G}$ to $G$.

[Hint: Show that $e(1)$ is an $N$th root of unity.]

5. Show that all characters on $S^1$ are given by

$$e_n(x) = e^{2\pi i n x} \quad \text{with } n \in \mathbb{Z},$$

and check that $e_n \mapsto n$ defines an isomorphism from $\widehat{S^1}$ to $\mathbb{Z}$.

[Hint: If $F$ is continuous and $F(x + y) = F(x) F(y)$, then $F$ is differentiable. To
see this, note that if $F(0) \ne 0$, then for appropriate $\delta$, $c = \int_0^\delta F(y) \, dy \ne 0$, and
$c F(x) = \int_x^{\delta + x} F(y) \, dy$. Differentiate to conclude that $F(x) = e^{Ax}$ for some $A$.]

6. Prove that all characters on $\mathbb{R}$ take the form

$$e_\xi(x) = e^{2\pi i \xi x} \quad \text{with } \xi \in \mathbb{R},$$

and that $e_\xi \mapsto \xi$ defines an isomorphism from $\hat{\mathbb{R}}$ to $\mathbb{R}$. The argument in Exercise 5
applies here as well.

7. Let $\zeta = e^{2\pi i/N}$. Define the $N \times N$ matrix $M = (a_{jk})_{1 \le j, k \le N}$ by
$a_{jk} = N^{-1/2} \zeta^{jk}$.

(a) Show that $M$ is unitary.

(b) Interpret the identity $(Mu, Mv) = (u, v)$ and the fact that $M^* = M^{-1}$ in
terms of Fourier series on $\mathbb{Z}(N)$.

8. Suppose that $P(x) = \sum_{n=1}^{N} a_n e^{2\pi i n x}$.

(a) Show by using the Parseval identities for the circle and $\mathbb{Z}(N)$, that

$$\int_0^1 |P(x)|^2 \, dx = \frac{1}{N} \sum_{j=1}^{N} |P(j/N)|^2.$$

(b) Prove the reconstruction formula

$$P(x) = \sum_{j=1}^{N} P(j/N) K(x - (j/N))$$

where

$$K(x) = \frac{e^{2\pi i x}}{N} \cdot \frac{1 - e^{2\pi i N x}}{1 - e^{2\pi i x}} = \frac{1}{N} \left( e^{2\pi i x} + e^{2\pi i 2x} + \cdots + e^{2\pi i N x} \right).$$

Observe that $P$ is completely determined by the values $P(j/N)$ for $1 \le j \le N$.
Note also that $K(0) = 1$, and $K(j/N) = 0$ whenever $j$ is not congruent to $0$
modulo $N$.


9. To prove the following assertions, modify the argument given in the text.

(a) Show that one can compute the Fourier coefficients of a function on $\mathbb{Z}(N)$
when $N = 3^n$ with at most $6N \log_3 N$ operations.

(b) Generalize this to $N = \alpha^n$ where $\alpha$ is an integer $> 1$.


10. A group $G$ is cyclic if there exists $g \in G$ that generates all of $G$, that is,
if any element in $G$ can be written as $g^n$ for some $n \in \mathbb{Z}$. Prove that a finite
abelian group is cyclic if and only if it is isomorphic to $\mathbb{Z}(N)$ for some $N$.

11. Write down the multiplicative tables for the groups $\mathbb{Z}^*(3)$, $\mathbb{Z}^*(4)$, $\mathbb{Z}^*(5)$,
$\mathbb{Z}^*(6)$, $\mathbb{Z}^*(8)$, and $\mathbb{Z}^*(9)$. Which of these groups are cyclic?

12. Suppose that $G$ is a finite abelian group and $e : G \to \mathbb{C}$ is a function that
satisfies $e(x \cdot y) = e(x) e(y)$ for all $x, y \in G$. Prove that either $e$ is identically $0$,
or $e$ never vanishes. In the second case, show that for each $x$, $e(x) = e^{2\pi i r}$ for
some $r \in \mathbb{Q}$ of the form $r = p/q$, where $q = |G|$.
13. In analogy with ordinary Fourier series, one may interpret finite Fourier
expansions using convolutions as follows. Suppose $G$ is a finite abelian group,
$1_G$ its unit, and $V$ the vector space of complex-valued functions on $G$.

(a) The convolution of two functions $f$ and $g$ in $V$ is defined for each $a \in G$
by

$$(f * g)(a) = \frac{1}{|G|} \sum_{b \in G} f(b) g(a \cdot b^{-1}).$$

Show that for all $e \in \hat{G}$ one has $\widehat{(f * g)}(e) = \hat{f}(e) \hat{g}(e)$.

(b) Use Theorem 2.5 to show that if $e$ is a character on $G$, then

$$\sum_{e \in \hat{G}} e(c) = 0 \quad \text{whenever } c \in G \text{ and } c \ne 1_G.$$

(c) As a result of (b), show that the Fourier series $Sf(a) = \sum_{e \in \hat{G}} \hat{f}(e) e(a)$ of
a function $f \in V$ takes the form

$$Sf = f * D,$$

where $D$ is defined by

$$(4) \qquad D(c) = \sum_{e \in \hat{G}} e(c) = \begin{cases} |G| & \text{if } c = 1_G, \\ 0 & \text{otherwise.} \end{cases}$$

Since $f * D = f$, we recover the fact that $Sf = f$. Loosely speaking, $D$
corresponds to a “Dirac delta function”; it has unit mass

$$\frac{1}{|G|} \sum_{c \in G} D(c) = 1,$$

and (4) says that this mass is concentrated at the unit element in $G$. Thus
$D$ has the same interpretation as the “limit” of a family of good kernels.
(See Section 4, Chapter 2.)

Note. The function $D$ reappears in the next chapter as $\delta_1(n)$.

4 Problems 

1. Prove that if $n$ and $m$ are two positive integers that are relatively prime, then

$$\mathbb{Z}(nm) \approx \mathbb{Z}(n) \times \mathbb{Z}(m).$$

[Hint: Consider the map $\mathbb{Z}(nm) \to \mathbb{Z}(n) \times \mathbb{Z}(m)$ given by $k \mapsto (k \mod n, k \mod m)$,
and use the fact that there exist integers $x$ and $y$ such that $xn + ym = 1$.]

2.* Every finite abelian group $G$ is isomorphic to a direct product of cyclic
groups. Here are two more precise formulations of this theorem.

• If $p_1, \ldots, p_s$ are the distinct primes appearing in the factorization of the
order of $G$, then

$$G \approx G(p_1) \times \cdots \times G(p_s),$$

where each $G(p)$ is of the form $G(p) = \mathbb{Z}(p^{r_1}) \times \cdots \times \mathbb{Z}(p^{r_\ell})$, with
$0 \le r_1 \le \cdots \le r_\ell$ (this sequence of integers depends on $p$ of course). This
decomposition is unique.

• There exist unique integers $d_1, \ldots, d_k$ such that

$$d_1 | d_2, \quad d_2 | d_3, \quad \ldots, \quad d_{k-1} | d_k$$

and

$$G \approx \mathbb{Z}(d_1) \times \cdots \times \mathbb{Z}(d_k).$$

Deduce the second formulation from the first.


3. Let $\hat{G}$ denote the collection of distinct characters of the finite abelian group
$G$.

(a) Note that if $G = \mathbb{Z}(N)$, then $\hat{G}$ is isomorphic to $G$.

(b) Prove that $\widehat{G_1 \times G_2} = \hat{G}_1 \times \hat{G}_2$.

(c) Prove using Problem 2 that if $G$ is a finite abelian group, then $\hat{G}$ is
isomorphic to $G$.

4.* When $p$ is prime the group $\mathbb{Z}^*(p)$ is cyclic, and $\mathbb{Z}^*(p) \approx \mathbb{Z}(p - 1)$.
Dirichlet's Theorem


Dirichlet, Gustav Lejeune (Düren 1805-Göttingen 1859),
German mathematician. He was a number theorist at
heart. But, while studying in Paris, being a very likeable
person, he was befriended by Fourier and other
like-minded mathematicians, and he learned analysis
from them. Thus equipped, he was able to lay the
foundation for the application of Fourier analysis to
(analytic) theory of numbers.

S. Bochner, 1966


As a striking application of the theory of finite Fourier series, we now
prove Dirichlet's theorem on primes in arithmetic progression. This theorem
states that if $q$ and $\ell$ are positive integers with no common factor,
then the progression

$$\ell, \ \ell + q, \ \ell + 2q, \ \ell + 3q, \ \ldots, \ \ell + kq, \ \ldots$$

contains infinitely many prime numbers. This change of subject matter
that we undertake illustrates the wide applicability of ideas from Fourier
analysis to various areas outside its seemingly narrower confines. In this
particular case, it is the theory of Fourier series on the finite abelian
group $\mathbb{Z}^*(q)$ that plays a key role in the solution of the problem.

1 A little elementary number theory 

We begin by introducing the requisite background. This involves elementary
ideas of divisibility of integers, and in particular properties regarding
prime numbers. Here the basic fact, called the fundamental theorem of
arithmetic, is that every integer is the product of primes in an essentially
unique way.


1.1 The fundamental theorem of arithmetic 

The following theorem is a mathematical formulation of long division.

Theorem 1.1 (Euclid’s algorithm) For any integers a and b with
$b > 0$, there exist unique integers q and r with $0 \le r < b$ such that

$$a = qb + r.$$

Here q denotes the quotient of a by b, and r is the remainder, which
is smaller than b.

Proof. First we prove the existence of q and r. Let S denote the set
of all non-negative integers of the form $a - qb$ with $q \in \mathbb{Z}$. This set is
non-empty and in fact S contains arbitrarily large positive integers since
$b \ne 0$. Let r denote the smallest element in S, so that

$$r = a - qb$$

for some integer q. By construction $0 \le r$, and we claim that $r < b$. If
not, we may write $r = b + s$ with $0 \le s < r$, so $b + s = a - qb$, which then
implies

$$s = a - (q + 1)b.$$

Hence $s \in S$ with $s < r$, and this contradicts the minimality of r.
So $r < b$, hence q and r satisfy the conditions of the theorem.

To prove uniqueness, suppose we also had $a = q_1 b + r_1$ where
$0 \le r_1 < b$. By subtraction we find

$$(q - q_1)b = r_1 - r.$$

The left-hand side has absolute value 0 or $\ge b$, while the right-hand side
has absolute value $< b$. Hence both sides of the equation must be 0,
which gives $q = q_1$ and $r = r_1$.

An integer a divides b if there exists another integer c such that
$ac = b$; we then write $a \mid b$ and say that a is a divisor of b. Note that in
particular 1 divides every integer, and $a \mid a$ for all integers a. A prime
number is a positive integer greater than 1 that has no positive divisors 
besides 1 and itself. The main theorem in this section says that any 
positive integer can be written uniquely as the product of prime numbers. 

The greatest common divisor of two positive integers a and b is the 
largest integer that divides both a and b. We usually denote the greatest 
common divisor by gcd(a, b). Two positive integers are relatively prime 
if their greatest common divisor is 1. In other words, 1 is the only positive 
divisor common to both a and b.

Theorem 1.2 If $\gcd(a, b) = d$, then there exist integers x and y such
that

$$ax + by = d.$$

Proof. Consider the set S of all positive integers of the form $ax + by$
where $x, y \in \mathbb{Z}$, and let s be the smallest element in S. We claim that $s = d$.
By construction, there exist integers x and y such that

$$ax + by = s.$$

Clearly, any divisor of a and b divides s, so we must have $d \le s$. The proof
will be complete if we can show that $s \mid a$ and $s \mid b$. By Euclid’s algorithm,
we can write $a = qs + r$ with $0 \le r < s$. Multiplying the above by q we
find $qax + qby = qs$, and therefore

$$qax + qby = a - r.$$

Hence $r = a(1 - qx) + b(-qy)$. Since s was minimal in S and $0 \le r < s$,
we conclude that $r = 0$, therefore s divides a. A similar argument shows
that s divides b, hence $s = d$ as desired.
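The proof above is an existence argument based on a minimal element. As a complement, here is a minimal Python sketch that actually produces the integers x and y of Theorem 1.2 by repeated use of the division in Theorem 1.1. This iterative scheme is a standard alternative and not the argument used in the text; the function name `extended_gcd` is our own choice.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = gcd(a, b) and a*x + b*y = d, for a, b > 0."""
    # Invariant kept at every step:
    #   old_r = a*old_x + b*old_y   and   r = a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r  # quotient from the division old_r = q*r + remainder
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


d, x, y = extended_gcd(240, 46)
print(d, x, y, 240 * x + 46 * y)
```

The last remainder before 0 is the greatest common divisor, and the invariant guarantees that it is an integer combination of a and b.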

In particular we record the following three consequences of the theorem.

Corollary 1.3 Two positive integers a and b are relatively prime if and
only if there exist integers x and y such that $ax + by = 1$.

Proof. If a and b are relatively prime, two integers x and y with the
desired property exist by Theorem 1.2. Conversely, if $ax + by = 1$ holds
and d is positive and divides both a and b, then d divides 1, hence $d = 1$.


Corollary 1.4 If a and c are relatively prime and c divides ab, then c 
divides b. In particular, if p is a prime that does not divide a and p 
divides ab, then p divides b. 


Proof. We can write $1 = ax + cy$, so multiplying by b we find
$b = abx + cby$. Hence $c \mid b$.

Corollary 1.5 If p is prime and p divides the product $a_1 \cdots a_r$, then p
divides $a_i$ for some i.


Proof. By the previous corollary, if p does not divide $a_1$, then p
divides $a_2 \cdots a_r$, so eventually $p \mid a_i$.

We can now prove the main result of this section.

Theorem 1.6 Every positive integer greater than 1 can be factored
uniquely into a product of primes.

Proof. First, we show that such a factorization is possible. We
do so by proving that the set S of positive integers $> 1$ which do not
have a factorization into primes is empty. Arguing by contradiction, we
assume that $S \ne \emptyset$. Let n be the smallest element of S. Since n cannot
be a prime, there exist integers $a > 1$ and $b > 1$ such that $ab = n$. But
then $a < n$ and $b < n$, so $a \notin S$ as well as $b \notin S$. Hence both a and b
have prime factorizations and so does their product n. This implies
$n \notin S$, therefore S is empty, as desired.

We now turn our attention to the uniqueness of the factorization. Suppose
that n has two factorizations into primes

$$n = p_1 p_2 \cdots p_r = q_1 q_2 \cdots q_s.$$

So $p_1$ divides $q_1 q_2 \cdots q_s$, and we can apply Corollary 1.5 to conclude that
$p_1 \mid q_i$ for some i. Since $q_i$ is prime, we must have $p_1 = q_i$. Continuing
with this argument we find that the two factorizations of n are equal up
to a permutation of the factors.
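To make Theorem 1.6 concrete, the following short Python sketch computes the prime factorization of an integer by trial division. It only illustrates the existence part of the theorem; the function name `factorize` is ours, and trial division is merely the simplest method, not an efficient one.

```python
def factorize(n: int) -> list[int]:
    """Return the prime factors of n > 1 in non-decreasing order."""
    factors = []
    p = 2
    while p * p <= n:
        # Every time p divides n, record it and divide it out.
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        # Whatever remains has no divisor <= its square root, so it is prime.
        factors.append(n)
    return factors


print(factorize(360))
```

Because the factors are returned in sorted order, uniqueness in Theorem 1.6 means that this list is completely determined by n.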

We briefly digress to give an alternate definition of the group $\mathbb{Z}^*(q)$
which appeared in the previous chapter. According to our initial definition,
$\mathbb{Z}^*(q)$ is the multiplicative group of units in $\mathbb{Z}(q)$: those $n \in \mathbb{Z}(q)$
for which there exists an integer m so that

$$nm \equiv 1 \mod q. \tag{1}$$

Equivalently, $\mathbb{Z}^*(q)$ is the group under multiplication of all integers in
$\mathbb{Z}(q)$ that are relatively prime to q. Indeed, notice that if (1) is satisfied,
then automatically n and q are relatively prime. Conversely, suppose
we assume that n and q are relatively prime. Then, if we put $a = n$
and $b = q$ in Corollary 1.3, we find

$$nx + qy = 1.$$

Hence $nx \equiv 1 \mod q$, and we can take $m = x$ to establish the equivalence.
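The equivalence of the two descriptions of $\mathbb{Z}^*(q)$ can be checked by machine for small q. In the sketch below (the helper names `units` and `inverse_mod` are our own), the units are listed using the gcd criterion, and each inverse m in (1) is then computed with Python's built-in three-argument `pow`, which accepts the exponent -1 from Python 3.8 on.

```python
from math import gcd


def units(q: int) -> list[int]:
    """Elements of Z(q) = {0, ..., q-1} that are relatively prime to q."""
    return [n for n in range(q) if gcd(n, q) == 1]


def inverse_mod(n: int, q: int) -> int:
    """Return m with n*m = 1 mod q; raises ValueError if gcd(n, q) != 1."""
    return pow(n, -1, q)


q = 12
for n in units(q):
    m = inverse_mod(n, q)
    print(n, m, (n * m) % q)
```

The length of `units(q)` is the order of the group, which reappears below as the Euler phi-function.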

1.2 The infinitude of primes 

The study of prime numbers has always been a central topic in arithmetic, 
and the first fundamental problem that arose was to determine whether
there are infinitely many primes or not. This problem was solved in
Euclid’s Elements with a simple and very elegant argument. 

Theorem 1.7 There are infinitely many primes. 

Proof. Suppose not, and denote by $p_1, \ldots, p_n$ the complete set of
primes. Define

$$N = p_1 p_2 \cdots p_n + 1.$$

Since N is larger than any $p_i$, the integer N cannot be prime. Therefore,
N is divisible by a prime that belongs to our list. But this is also an 
absurdity since every prime divides the product, yet no prime divides 1. 
This contradiction concludes the proof. 

Euclid’s argument actually can be modified to deduce finer results
about the infinitude of primes. To see this, consider the following problem.
Prime numbers (except for 2) can be divided into two classes depending
on whether they are of the form $4k + 1$ or $4k + 3$, and the above
theorem says that at least one of these classes has to be infinite. A natural
question is to ask whether both classes are infinite, and if not, which
one is? In the case of primes of the form $4k + 3$, the fact that the class is
infinite has a proof that is similar to Euclid’s, but with a twist. If there
are only finitely many such primes, enumerate them in increasing order
omitting 3,

$$p_1 = 7, \quad p_2 = 11, \quad \ldots, \quad p_n,$$

and let

$$N = 4 p_1 p_2 \cdots p_n + 3.$$

Clearly, N is of the form $4k + 3$ and cannot be prime since $N > p_n$.
Since the product of two numbers of the form $4m + 1$ is again of the
form $4m + 1$, one of the prime divisors of N, say p, must be of the form
$4k + 3$. We must have $p \ne 3$, since 3 does not divide the product in the
definition of N. Also, p cannot be one of the other primes of the form
$4k + 3$, that is, $p \ne p_i$ for $i = 1, \ldots, n$, because then p divides the product
$p_1 \cdots p_n$ but does not divide 3.
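The construction in this argument can be tried on actual lists of primes. The Python sketch below (function names are ours) forms $N = 4 p_1 \cdots p_n + 3$ from a finite list of primes of the form $4k + 3$ other than 3, and extracts a prime divisor of N of the form $4k + 3$, which is necessarily a new one. For a genuinely finite list, N may of course turn out to be prime itself, in which case it is the new prime.

```python
from math import prod


def smallest_prime_factor(n: int) -> int:
    """Smallest prime dividing n > 1, found by trial division."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n


def new_prime_3_mod_4(primes: list[int]) -> tuple[int, int]:
    """Given primes = 3 mod 4 (3 itself omitted), return (N, p) where
    N = 4 * product + 3 and p is a prime divisor of N with p = 3 mod 4."""
    N = 4 * prod(primes) + 3
    m = N
    while m > 1:
        p = smallest_prime_factor(m)
        if p % 4 == 3:
            return N, p
        m //= p  # p = 1 mod 4: strip it and keep looking
    raise AssertionError("unreachable: N = 3 mod 4 has a prime factor = 3 mod 4")


print(new_prime_3_mod_4([7, 11, 19]))
```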

It remains to determine if the class of primes of the form $4k + 1$ is
infinite. A simple-minded modification of the above argument does not
work since the product of two numbers of the form $4m + 3$ is never of the
form $4m + 3$. More generally, in an attempt to prove the law of quadratic
reciprocity, Legendre formulated the following statement:

If q and $\ell$ are relatively prime, then the sequence

$$\ell + kq, \quad k \in \mathbb{Z}$$

contains infinitely many primes (hence at least one prime!).

Of course, the condition that q and $\ell$ be relatively prime is necessary,
for otherwise $\ell + kq$ is never prime. In other words, this hypothesis says
that any arithmetic progression that could contain primes necessarily
contains infinitely many of them.

Legendre’s assertion was proved by Dirichlet. The key idea in his proof 
is Euler’s analytical approach to prime numbers involving his product 
formula, which gives a strengthened version of Theorem 1.7. This insight 
of Euler led to a deep connection between the theory of primes and 
analysis. 

The zeta function and its Euler product 

We begin with a rapid review of infinite products. If $\{A_n\}_{n=1}^{\infty}$ is a
sequence of real numbers, we define

$$\prod_{n=1}^{\infty} A_n = \lim_{N \to \infty} \prod_{n=1}^{N} A_n$$

if the limit exists, in which case we say that the product converges. The 
natural approach is to take logarithms and transform products into sums. 
We gather in a lemma the properties we shall need of the function logx, 
defined for positive real numbers. 

Lemma 1.8 The exponential and logarithm functions satisfy the following
properties:

(i) $e^{\log x} = x$.

(ii) $\log(1 + x) = x + E(x)$ where $|E(x)| \le x^2$ if $|x| < 1/2$.

(iii) If $\log(1 + x) = y$ and $|x| < 1/2$, then $|y| \le 2|x|$.

In terms of the O notation, property (ii) will be recorded as
$\log(1 + x) = x + O(x^2)$.

Proof. Property (i) is standard. To prove property (ii) we use the
power series expansion of $\log(1 + x)$ for $|x| < 1$, that is,

$$\log(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} x^n. \tag{2}$$

Then we have

$$E(x) = \log(1 + x) - x = -\frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots,$$

and the triangle inequality implies

$$|E(x)| \le \frac{x^2}{2} \left( 1 + |x| + |x|^2 + \cdots \right).$$

Therefore, if $|x| \le 1/2$ we can sum the geometric series on the right-hand
side to find that

$$|E(x)| \le \frac{x^2}{2} \left( 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots \right) \le \frac{x^2}{2} \left( \frac{1}{1 - 1/2} \right) \le x^2.$$

The proof of property (iii) is now immediate; if $x \ne 0$ and $|x| \le 1/2$, then

$$\left| \frac{\log(1 + x)}{x} \right| \le 1 + \left| \frac{E(x)}{x} \right| \le 1 + |x| \le 2,$$

and if $x = 0$, (iii) is clearly also true.


We can now prove the main result on infinite products of real numbers. 

Proposition 1.9 If $A_n = 1 + a_n$ and $\sum |a_n|$ converges, then the product
$\prod_n A_n$ converges, and this product vanishes if and only if one of
its factors $A_n$ vanishes. Also, if $a_n \ne 1$ for all n, then $\prod_n 1/(1 - a_n)$
converges.

Proof. If $\sum |a_n|$ converges, then for all large n we must have $|a_n| < 1/2$.
Disregarding finitely many terms if necessary, we may assume that
this inequality holds for all n. Then we may write the partial products
as follows:

$$\prod_{n=1}^{N} A_n = \prod_{n=1}^{N} e^{\log(1 + a_n)} = e^{B_N},$$

where $B_N = \sum_{n=1}^{N} b_n$ with $b_n = \log(1 + a_n)$. By the lemma, we know
that $|b_n| \le 2|a_n|$, so that $B_N$ converges to a real number, say B. Since
the exponential function is continuous, we conclude that $e^{B_N}$ converges
to $e^B$ as N goes to infinity, proving the first assertion of the proposition.
Observe also that if $1 + a_n \ne 0$ for all n, the product converges to a
non-zero limit since it is expressed as $e^B$.

Finally observe that the partial products of $\prod_n 1/(1 - a_n)$ are
$1/\prod_{n=1}^{N} (1 - a_n)$, so the same argument as above proves that the product
in the denominator converges to a non-zero limit.

With these preliminaries behind us, we can now return to the heart of
the matter. For s a real number (strictly) greater than 1, we define the
zeta function by

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$

To see that the series defining $\zeta$ converges, we use the principle that
whenever f is a decreasing function one can compare $\sum f(n)$ with
$\int f(x)\,dx$, as is suggested by Figure 1. Note also that a similar technique
was used in Chapter 3, that time bounding a sum from below by
an integral.

Figure 1. Comparing sums with integrals

Here we take $f(x) = 1/x^s$ to see that

$$\sum_{n=1}^{\infty} \frac{1}{n^s} \le 1 + \sum_{n=2}^{\infty} \int_{n-1}^{n} \frac{dx}{x^s} = 1 + \int_{1}^{\infty} \frac{dx}{x^s},$$

and therefore

$$\zeta(s) \le 1 + \frac{1}{s - 1}. \tag{3}$$

Clearly, the series defining $\zeta$ converges uniformly on each half-line
$s > s_0 > 1$, hence $\zeta$ is continuous when $s > 1$. The zeta function was
already mentioned earlier in the discussion of the Poisson summation
formula and the theta function.

The key result is Euler’s product formula. 

Theorem 1.10 For every $s > 1$, we have

$$\zeta(s) = \prod_{p} \frac{1}{1 - 1/p^s},$$

where the product is taken over all primes.

It is important to remark that this identity is an analytic expression
of the fundamental theorem of arithmetic. In fact, each factor of the
product $1/(1 - p^{-s})$ can be written as a convergent geometric series

$$1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots + \frac{1}{p^{Ms}} + \cdots.$$

So we consider

$$\prod_{p_j} \left( 1 + \frac{1}{p_j^s} + \frac{1}{p_j^{2s}} + \cdots + \frac{1}{p_j^{Ms}} + \cdots \right),$$

where the product is taken over all primes, which we order in increasing
order $p_1 < p_2 < \cdots$. Proceeding formally (these manipulations will be
justified below), we calculate the product as a sum of terms, each term
originating by picking out a term $1/p_j^{ks}$ (in the sum corresponding to
$p_j$) with a k, which of course will depend on j, and with $k = 0$ for j
sufficiently large. The product obtained this way is

$$\frac{1}{\left( p_1^{k_1} p_2^{k_2} \cdots p_m^{k_m} \right)^s}.$$

By the fundamental theorem of arithmetic, each integer $\ge 1$ occurs in
this way uniquely, hence the product equals

$$\sum_{n=1}^{\infty} \frac{1}{n^s}.$$

We now justify this heuristic argument.

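Proof. Suppose M and N are positive integers with $M > N$. Observe
now that any positive integer $n \le N$ can be written uniquely as a product
of primes, and that each prime must be less than or equal to N and
repeated less than M times. Therefore

$$\sum_{n=1}^{N} \frac{1}{n^s} \le \prod_{p \le N} \left( 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots + \frac{1}{p^{Ms}} \right) \le \prod_{p \le N} \frac{1}{1 - p^{-s}} \le \prod_{p} \frac{1}{1 - p^{-s}}.$$

Letting N tend to infinity now yields

$$\sum_{n=1}^{\infty} \frac{1}{n^s} \le \prod_{p} \frac{1}{1 - p^{-s}}.$$

For the reverse inequality, we argue as follows. Again, by the fundamental
theorem of arithmetic, we find that

$$\prod_{p \le N} \left( 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots + \frac{1}{p^{Ms}} \right) \le \sum_{n=1}^{\infty} \frac{1}{n^s}.$$

Letting M tend to infinity gives

$$\prod_{p \le N} \frac{1}{1 - p^{-s}} \le \sum_{n=1}^{\infty} \frac{1}{n^s}.$$

Hence

$$\prod_{p} \frac{1}{1 - p^{-s}} \le \sum_{n=1}^{\infty} \frac{1}{n^s},$$

and the proof of the product formula is complete.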
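The chain of inequalities in the proof can be observed numerically. The following Python sketch (all function names are ours) compares the partial sum $\sum_{n \le N} n^{-s}$ with the truncated product over primes $p \le N$ for $s = 2$. For reference it uses the classical value $\zeta(2) = \pi^2/6$, which is not derived in this section.

```python
from math import isqrt, pi


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes p <= n."""
    is_prime = [True] * (n + 1)
    is_prime[:2] = [False] * min(2, n + 1)
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def zeta_partial_sum(s: float, N: int) -> float:
    """Sum of 1/n^s over 1 <= n <= N."""
    return sum(1.0 / n**s for n in range(1, N + 1))


def euler_partial_product(s: float, N: int) -> float:
    """Product of 1/(1 - p^(-s)) over primes p <= N."""
    product = 1.0
    for p in primes_up_to(N):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


N = 1000
print(zeta_partial_sum(2.0, N), euler_partial_product(2.0, N), pi**2 / 6)
```

The three printed numbers should appear in increasing order, as the proof predicts, with the truncated product already much closer to the limit than the partial sum.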

We now come to Euler’s version of Theorem 1.7, which inspired Dirichlet’s
approach to the general problem of primes in arithmetic progression.
The point is the following proposition.

Proposition 1.11 The series

$$\sum_{p} \frac{1}{p}$$

diverges, when the sum is taken over all primes p.

Of course, if there were only finitely many primes the series would
converge automatically.

Proof. We take logarithms of both sides of the Euler formula. Since
$\log x$ is continuous, we may write the logarithm of the infinite product
as the sum of the logarithms. Therefore, we obtain for $s > 1$

$$-\sum_{p} \log(1 - 1/p^s) = \log \zeta(s).$$

Since $\log(1 + x) = x + O(|x|^2)$ whenever $|x| \le 1/2$, we get

$$-\sum_{p} \left[ -1/p^s + O(1/p^{2s}) \right] = \log \zeta(s),$$

which gives

$$\sum_{p} 1/p^s + O(1) = \log \zeta(s).$$

The term $O(1)$ appears because $\sum_p 1/p^{2s} \le \sum_{n=1}^{\infty} 1/n^2$. Now we let s
tend to 1 from above, namely $s \to 1^+$, and note that $\zeta(s) \to \infty$ since
$\sum_{n=1}^{\infty} 1/n^s \ge \sum_{n=1}^{M} 1/n^s$, and therefore

$$\liminf_{s \to 1^+} \sum_{n=1}^{\infty} 1/n^s \ge \sum_{n=1}^{M} 1/n \quad \text{for every } M.$$

We conclude that $\sum_p 1/p^s \to \infty$ as $s \to 1^+$, and since $1/p > 1/p^s$ for all
$s > 1$, we finally have that

$$\sum_{p} 1/p = \infty.$$
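The divergence in Proposition 1.11 is extremely slow, which a quick computation makes visible. The Python sketch below (function names are ours) sums $1/p$ over primes up to several bounds. A classical estimate of Mertens, which is not proved in this chapter, says that these sums grow like $\log \log N$.

```python
from math import isqrt, log


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes p <= n."""
    is_prime = [True] * (n + 1)
    is_prime[:2] = [False] * min(2, n + 1)
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def prime_reciprocal_sum(N: int) -> float:
    """Sum of 1/p over primes p <= N."""
    return sum(1.0 / p for p in primes_up_to(N))


for N in (10**2, 10**3, 10**4, 10**5):
    # Second column: the partial sum; third column: log log N for comparison.
    print(N, prime_reciprocal_sum(N), log(log(N)))
```

The partial sums keep increasing, but only by a bounded amount each time N is multiplied by 10, in line with the double logarithm.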
In the rest of this chapter we see how Dirichlet adapted Euler’s insight.

2 Dirichlet’s theorem

We remind the reader of our goal: 

Theorem 2.1 If q and $\ell$ are relatively prime positive integers, then there
are infinitely many primes of the form $\ell + kq$ with $k \in \mathbb{Z}$.

Following Euler’s argument, Dirichlet proved this theorem by showing
that the series

$$\sum_{p \equiv \ell \mod q} \frac{1}{p}$$

diverges, where the sum is over all primes congruent to $\ell$ modulo q. Once
q is fixed and no confusion is possible, we write $p \equiv \ell$ to denote a prime
congruent to $\ell$ modulo q. The proof consists of several steps, one of
which requires Fourier analysis on the group $\mathbb{Z}^*(q)$. Before proceeding
with the theorem in its complete generality, we outline the solution to
the particular problem raised earlier: are there infinitely many primes of
the form $4k + 1$? This example, which consists of the special case $q = 4$
and $\ell = 1$, illustrates all the important steps in the proof of Dirichlet’s
theorem.

We begin with the character on $\mathbb{Z}^*(4)$ defined by $\chi(1) = 1$ and
$\chi(3) = -1$. We extend this character to all of $\mathbb{Z}$ as follows:

$$\chi(n) = \begin{cases} 0 & \text{if } n \text{ is even,} \\ 1 & \text{if } n = 4k + 1, \\ -1 & \text{if } n = 4k + 3. \end{cases}$$

Note that this function is multiplicative, that is, $\chi(nm) = \chi(n)\chi(m)$ on
all of $\mathbb{Z}$. Let $L(s, \chi) = \sum_{n=1}^{\infty} \chi(n)/n^s$, so that

$$L(s, \chi) = 1 - \frac{1}{3^s} + \frac{1}{5^s} - \frac{1}{7^s} + \cdots.$$

Then $L(1, \chi)$ is the convergent series given by

$$1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots.$$

Since the terms in the series are alternating and their absolute values
decrease to zero we have $L(1, \chi) \ne 0$. Because $\chi$ is multiplicative, the
Euler product generalizes (as we will prove later) to give

$$\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p} \frac{1}{1 - \chi(p)/p^s}.$$

Taking the logarithm of both sides, we find that

$$\log L(s, \chi) = \sum_{p} \frac{\chi(p)}{p^s} + O(1).$$

Letting $s \to 1^+$, the observation that $L(1, \chi) \ne 0$ shows that $\sum_p \chi(p)/p^s$
remains bounded. Hence

$$\sum_{p \equiv 1} \frac{1}{p^s} - \sum_{p \equiv 3} \frac{1}{p^s}$$

is bounded as $s \to 1^+$. However, we know from Proposition 1.11 that

$$\sum_{p} \frac{1}{p^s}$$

is unbounded as $s \to 1^+$, so putting these two facts together, we find
that

$$2 \sum_{p \equiv 1} \frac{1}{p^s}$$

is unbounded as $s \to 1^+$. Hence $\sum_{p \equiv 1} 1/p$ diverges, and as a consequence
there are infinitely many primes of the form $4k + 1$.

We digress briefly to show that in fact $L(1, \chi) = \pi/4$. To see this, we
integrate the identity

$$\frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + \cdots,$$

and get

$$\int_0^y \frac{dx}{1 + x^2} = y - \frac{y^3}{3} + \frac{y^5}{5} - \cdots, \qquad 0 < y < 1.$$

We then let y tend to 1. The integral can be calculated as

$$\int_0^1 \frac{dx}{1 + x^2} = \arctan u \Big|_0^1 = \frac{\pi}{4},$$

so this proves that the series $1 - 1/3 + 1/5 - \cdots$ is Abel summable to
$\pi/4$. Since we know the series converges, its limit is the same as its Abel
limit, hence $1 - 1/3 + 1/5 - \cdots = \pi/4$.
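The character modulo 4 and the value $L(1, \chi) = \pi/4$ are easy to check by direct computation. The following Python sketch uses our own names `chi4` and `L_partial`; it verifies multiplicativity on a range of integers and evaluates a partial sum of the series.

```python
from math import pi


def chi4(n: int) -> int:
    """The character modulo 4 extended to all integers."""
    r = n % 4
    if r == 1:
        return 1
    if r == 3:
        return -1
    return 0  # n is even


def L_partial(s: float, N: int) -> float:
    """Partial sum of chi4(n)/n^s over 1 <= n <= N."""
    return sum(chi4(n) / n**s for n in range(1, N + 1))


# Multiplicativity: chi4(n*m) == chi4(n)*chi4(m) on a sample of integers.
print(all(chi4(n * m) == chi4(n) * chi4(m)
          for n in range(1, 40) for m in range(1, 40)))

# The alternating series 1 - 1/3 + 1/5 - ... compared with pi/4.
print(L_partial(1.0, 200001), pi / 4)
```

Since the series is alternating with decreasing terms, the error of a partial sum is at most the size of the first omitted term, so convergence at $s = 1$ is slow but easy to control.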


The rest of this chapter gives the full proof of Dirichlet’s theorem. We 
begin with the Fourier analysis (which is actually the last step in the 
example given above), and reduce the theorem to the non-vanishing of 
L-functions.

2.1 Fourier analysis, Dirichlet characters, and reduction of the
theorem


In what follows we take the abelian group G to be $\mathbb{Z}^*(q)$. Our formulas
below involve the order of G, which is the number of integers $0 \le n < q$
that are relatively prime to q; this number defines the Euler phi-function
$\varphi(q)$, and $|G| = \varphi(q)$.

Consider the function $\delta_\ell$ on G, which we think of as the characteristic
function of $\ell$; if $n \in \mathbb{Z}^*(q)$, then

$$\delta_\ell(n) = \begin{cases} 1 & \text{if } n \equiv \ell \mod q, \\ 0 & \text{otherwise.} \end{cases}$$

We can expand this function in a Fourier series as follows:

$$\delta_\ell(n) = \sum_{e \in \hat{G}} \hat{\delta_\ell}(e)\, e(n),$$

where

$$\hat{\delta_\ell}(e) = \frac{1}{|G|} \sum_{m \in G} \delta_\ell(m)\, \overline{e(m)} = \frac{1}{|G|} \overline{e(\ell)}.$$

Hence

$$\delta_\ell(n) = \frac{1}{|G|} \sum_{e \in \hat{G}} \overline{e(\ell)}\, e(n).$$

We can extend the function $\delta_\ell$ to all of $\mathbb{Z}$ by setting $\delta_\ell(m) = 0$ whenever m
and q are not relatively prime. Similarly, the extensions of the characters
$e \in \hat{G}$ to all of $\mathbb{Z}$ which are given by the recipe

$$\chi(m) = \begin{cases} e(m) & \text{if } m \text{ and } q \text{ are relatively prime,} \\ 0 & \text{otherwise,} \end{cases}$$

are called the Dirichlet characters modulo q. We shall denote the
extension to $\mathbb{Z}$ of the trivial character of G by $\chi_0$, so that $\chi_0(m) = 1$ if
m and q are relatively prime, and 0 otherwise. Note that the Dirichlet
characters modulo q are multiplicative on all of $\mathbb{Z}$, in the sense that

$$\chi(nm) = \chi(n)\chi(m) \quad \text{for all } n, m \in \mathbb{Z}.$$

Since the integer q is fixed, we may, without fear of confusion, speak of
“Dirichlet characters” omitting reference to q. 1

With $|G| = \varphi(q)$, we may restate the above results as follows:


1 We use the notation $\chi$ instead of e to distinguish the Dirichlet characters (defined on
$\mathbb{Z}$) from the characters e (defined on $\mathbb{Z}^*(q)$).

Lemma 2.2 The Dirichlet characters are multiplicative. Moreover,

$$\delta_\ell(m) = \frac{1}{\varphi(q)} \sum_{\chi} \overline{\chi(\ell)}\, \chi(m),$$

where the sum is over all Dirichlet characters.
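Lemma 2.2 can be verified numerically when q is prime, because then $\mathbb{Z}^*(q)$ is cyclic (Problem 4 of the previous chapter) and its characters are explicit. The Python sketch below makes that simplifying assumption: it finds a generator g by brute force, defines the characters by $e_j(g^k) = e^{2\pi i jk/(q-1)}$, extends them by 0 on multiples of q, and evaluates the right-hand side of the lemma. All function names are ours, and a general modulus q would need more work.

```python
import cmath


def find_generator(q: int) -> int:
    """Brute-force search for a generator of Z*(q), q prime."""
    for g in range(1, q):
        if len({pow(g, k, q) for k in range(q - 1)}) == q - 1:
            return g
    raise ValueError("no generator found; q is assumed to be prime")


def dirichlet_characters(q: int):
    """All Dirichlet characters modulo a prime q, as functions on the integers."""
    g = find_generator(q)
    order = q - 1
    index = {pow(g, k, q): k for k in range(order)}  # m = g^index[m] mod q
    chars = []
    for j in range(order):
        def chi(m: int, j: int = j) -> complex:
            m %= q
            if m == 0:
                return 0j  # m and q are not relatively prime
            return cmath.exp(2j * cmath.pi * j * index[m] / order)
        chars.append(chi)
    return chars


def delta_via_characters(l: int, m: int, q: int) -> complex:
    """Right-hand side of Lemma 2.2."""
    chars = dirichlet_characters(q)
    return sum(chi(l).conjugate() * chi(m) for chi in chars) / len(chars)


q, l = 7, 3
print([round(abs(delta_via_characters(l, m, q))) for m in range(15)])
```

Up to rounding error the list is 1 exactly at the integers m congruent to 3 modulo 7 and 0 elsewhere, which is the extended function $\delta_\ell$.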

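With the above lemma we have taken our first step towards a proof of
the theorem, since this lemma shows that

$$\sum_{p \equiv \ell} \frac{1}{p^s} = \sum_{p} \frac{\delta_\ell(p)}{p^s} = \frac{1}{\varphi(q)} \sum_{\chi} \overline{\chi(\ell)} \sum_{p} \frac{\chi(p)}{p^s}.$$

Thus it suffices to understand the behavior of $\sum_p \chi(p) p^{-s}$ as $s \to 1^+$. In
fact, we divide the above sum in two parts depending on whether or not
$\chi$ is trivial. So we have

$$\sum_{p \equiv \ell} \frac{1}{p^s} = \frac{1}{\varphi(q)} \sum_{p} \frac{\chi_0(p)}{p^s} + \frac{1}{\varphi(q)} \sum_{\chi \ne \chi_0} \overline{\chi(\ell)} \sum_{p} \frac{\chi(p)}{p^s}$$

$$= \frac{1}{\varphi(q)} \sum_{p \text{ not dividing } q} \frac{1}{p^s} + \frac{1}{\varphi(q)} \sum_{\chi \ne \chi_0} \overline{\chi(\ell)} \sum_{p} \frac{\chi(p)}{p^s}. \tag{4}$$

Since there are only finitely many primes dividing q, Euler’s theorem
(Proposition 1.11) implies that the first sum on the right-hand side diverges
when s tends to 1. These observations show that Dirichlet’s theorem
is a consequence of the following assertion.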

Theorem 2.3 If $\chi$ is a nontrivial Dirichlet character, then the sum

$$\sum_{p} \frac{\chi(p)}{p^s}$$

remains bounded as $s \to 1^+$.


The proof of Theorem 2.3 requires the introduction of the L-functions, 
to which we now turn. 


2.2 Dirichlet L-functions 

We proved earlier that the zeta function $\zeta(s) = \sum_n 1/n^s$ could be
expressed as a product, namely

$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \frac{1}{(1 - p^{-s})}.$$

Dirichlet observed an analogue of this formula for the so-called L-functions
defined for $s > 1$ by

$$L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s},$$

where $\chi$ is a Dirichlet character.

Theorem 2.4 If $s > 1$, then

$$\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p} \frac{1}{(1 - \chi(p) p^{-s})},$$

where the product is over all primes.

Assuming this theorem for now, we can follow Euler’s argument formally:
taking the logarithm of the product and using the fact that
$\log(1 + x) = x + O(x^2)$ whenever x is small, we would get

$$\log L(s, \chi) = -\sum_{p} \log(1 - \chi(p)/p^s)$$

$$= -\sum_{p} \left[ -\frac{\chi(p)}{p^s} + O\left( \frac{1}{p^{2s}} \right) \right]$$

$$= \sum_{p} \frac{\chi(p)}{p^s} + O(1).$$

If $L(1, \chi)$ is finite and non-zero, then $\log L(s, \chi)$ is bounded as $s \to 1^+$,
and we can conclude that the sum

$$\sum_{p} \frac{\chi(p)}{p^s}$$

is bounded as $s \to 1^+$. We now make several observations about the
above formal argument.

First, we must prove the product formula in Theorem 2.4. Since the
Dirichlet characters $\chi$ can be complex-valued we will extend the logarithm
to complex numbers w of the form $w = 1/(1 - z)$ with $|z| < 1$.
(This will be done in terms of a power series.) Then we show that with
this definition of the logarithm, the proof of Euler’s product formula
given earlier carries over to L-functions.

Second, we must make sense of taking the logarithm of both sides of the
product formula. If the Dirichlet characters are real, this argument works
and is precisely the one given in the example corresponding to primes
of the form $4k + 1$. In general, the difficulty lies in the fact that $\chi(p)$ is
a complex number, and the complex logarithm is not single valued; in
particular, the logarithm of a product is not the sum of the logarithms.

Third, it remains to prove that whenever $\chi \ne \chi_0$, then $\log L(s, \chi)$ is
bounded as $s \to 1^+$. If (as we shall see) $L(s, \chi)$ is continuous at $s = 1$,
then it suffices to show that

$$L(1, \chi) \ne 0.$$

This is the non-vanishing we mentioned earlier, which corresponds to the
alternating series being non-zero in the previous example. The fact that
$L(1, \chi) \ne 0$ is the most difficult part of the argument.

So we will focus on three points: 

1. Complex logarithms and infinite products.

2. Study of $L(s, \chi)$.

3. Proof that $L(1, \chi) \ne 0$ if $\chi$ is non-trivial.

However, before we enter further into the details, we pause briefly to
discuss some historical facts surrounding Dirichlet’s theorem.

Historical digression 

In the following list, we have gathered the names of those mathematicians
whose work dealt most closely with the series of achievements related to
Dirichlet’s theorem. To give a better perspective, we attach the years in
which they reached the age of 35:

Euler 1742 
Legendre 1787 
Gauss 1812 
Dirichlet 1840 
Riemann 1861 

As we mentioned earlier, Euler’s discovery of the product formula for
the zeta function is the starting point in Dirichlet’s argument. Legendre
in effect conjectured the theorem because he needed it in his proof of the
law of quadratic reciprocity. However, this goal was first accomplished
by Gauss who, while not knowing how to establish the theorem about
primes in arithmetic progression, nevertheless found a number of different
proofs of quadratic reciprocity. Later, Riemann extended the study of
the zeta function to the complex plane and indicated how properties
related to the non-vanishing of that function were central in the further
understanding of the distribution of prime numbers.

Dirichlet proved his theorem in 1837. It should be noted that Fourier, 
who had befriended Dirichlet when the latter was a young mathematician 
visiting Paris, had died several years before. Besides the great activity in 
mathematics, that period was also a very fertile time in the arts, and in 
particular music. The era of Beethoven had ended only ten years earlier, 
and Schumann was now reaching the heights of his creativity. But the 
musician whose career was closest to Dirichlet was Felix Mendelssohn 
(four years his junior). It so happens that the latter began composing 
his famous violin concerto the year after Dirichlet succeeded in proving 
his theorem. 

3 Proof of the theorem 

We return to the proof of Dirichlet’s theorem and to the three difficulties 
mentioned above. 

3.1 Logarithms 

The device to deal with the first point is to define two logarithms, one
for complex numbers of the form $1/(1 - z)$ with $|z| < 1$ which we denote
by $\log_1$, and one for the function $L(s, \chi)$ which we will denote by $\log_2$.

For the first logarithm, we define

$$\log_1 \left( \frac{1}{1 - z} \right) = \sum_{k=1}^{\infty} \frac{z^k}{k} \quad \text{for } |z| < 1.$$

Note that $\log_1 w$ is then defined if $\mathrm{Re}(w) > 1/2$, and because of equation (2),
$\log_1 w$ gives an extension of the usual $\log x$ when x is a real
number $> 1/2$.

Proposition 3.1 The logarithm function $\log_1$ satisfies the following
properties:


(i) If $|z| < 1$, then

$$e^{\log_1 \left( \frac{1}{1 - z} \right)} = \frac{1}{1 - z}.$$

(ii) If $|z| < 1$, then

$$\log_1 \left( \frac{1}{1 - z} \right) = z + E_1(z),$$

where the error $E_1$ satisfies $|E_1(z)| \le |z|^2$ if $|z| < 1/2$.

(iii) If $|z| < 1/2$, then

$$\left| \log_1 \left( \frac{1}{1 - z} \right) \right| \le 2|z|.$$

Proof. To establish the first property, let $z = re^{i\theta}$ with $0 \le r < 1$,
and observe that it suffices to show that

$$(1 - re^{i\theta})\, e^{\sum_{k=1}^{\infty} (re^{i\theta})^k / k} = 1. \tag{5}$$

To do so, we differentiate the left-hand side with respect to r, and this
gives

$$\left[ -e^{i\theta} + (1 - re^{i\theta}) \left( \sum_{k=1}^{\infty} (re^{i\theta})^k / k \right)' \right] e^{\sum_{k=1}^{\infty} (re^{i\theta})^k / k}.$$

The term in brackets equals

$$-e^{i\theta} + (1 - re^{i\theta})\, e^{i\theta} \left( \sum_{k=1}^{\infty} (re^{i\theta})^{k-1} \right) = -e^{i\theta} + (1 - re^{i\theta})\, e^{i\theta}\, \frac{1}{1 - re^{i\theta}} = 0.$$

Having found that the left-hand side of the equation (5) is constant, we
set $r = 0$ and get the desired result.

The proofs of the second and third properties are the same as their
real counterparts given in Lemma 1.8.

Using these results we can state a sufficient condition guaranteeing
the convergence of infinite products of complex numbers. Its proof is the
same as in the real case, except that we now use the logarithm $\log_1$.

Proposition 3.2 If $\sum |a_n|$ converges, and $a_n \ne 1$ for all n, then

$$\prod_{n=1}^{\infty} \left( \frac{1}{1 - a_n} \right)$$

converges. Moreover, this product is non-zero.

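Proof. For n large enough, $|a_n| < 1/2$, so we may assume without
loss of generality that this inequality holds for all $n \ge 1$. Then

$$\prod_{n=1}^{N} \left( \frac{1}{1 - a_n} \right) = \prod_{n=1}^{N} e^{\log_1 \left( \frac{1}{1 - a_n} \right)} = e^{\sum_{n=1}^{N} \log_1 \left( \frac{1}{1 - a_n} \right)}.$$

But we know from the previous proposition that

$$\left| \log_1 \left( \frac{1}{1 - z} \right) \right| \le 2|z|,$$

so the fact that the series $\sum |a_n|$ converges immediately implies that the
limit

$$\lim_{N \to \infty} \sum_{n=1}^{N} \log_1 \left( \frac{1}{1 - a_n} \right) = A$$

exists. Since the exponential function is continuous, we conclude that
the product converges to $e^A$, which is clearly non-zero.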
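Properties (i) and (iii) of Proposition 3.1 can be tested numerically for a complex z. The following Python sketch truncates the power series defining $\log_1$; the helper name `log1_of_inverse` and the truncation at 400 terms are our own choices, adequate for the sample point used here because the tail is then negligible.

```python
import cmath


def log1_of_inverse(z: complex, terms: int = 400) -> complex:
    """Approximate log_1(1/(1 - z)) = sum over k >= 1 of z^k / k, for |z| < 1."""
    return sum(z**k / k for k in range(1, terms + 1))


z = 0.3 + 0.3j  # |z| is about 0.424, so |z| < 1/2
w = log1_of_inverse(z)

# Property (i): exponentiating recovers 1/(1 - z).
print(cmath.exp(w), 1 / (1 - z))

# Property (iii): |log_1(1/(1 - z))| <= 2|z|.
print(abs(w), 2 * abs(z))
```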

We may now prove the promised Dirichlet product formula

$$\sum_{n} \frac{\chi(n)}{n^s} = \prod_{p} \frac{1}{(1 - \chi(p) p^{-s})}.$$

For simplicity of notation, let L denote the left-hand side of the above
equation. Define

$$S_N = \sum_{n \le N} \chi(n) n^{-s} \quad \text{and} \quad \Pi_N = \prod_{p \le N} \left( \frac{1}{1 - \chi(p) p^{-s}} \right).$$

The infinite product $\Pi = \lim_{N \to \infty} \Pi_N = \prod_p \left( \frac{1}{1 - \chi(p) p^{-s}} \right)$ converges by
the previous proposition. Indeed, if we set $a_n = \chi(p_n) p_n^{-s}$, where $p_n$ is
the $n$th prime, we note that if $s > 1$, then $\sum |a_n| < \infty$.

Also, define

$$\Pi_{N,M} = \prod_{p \le N} \left( 1 + \frac{\chi(p)}{p^s} + \cdots + \frac{\chi(p^M)}{p^{Ms}} \right).$$

Now fix $\epsilon > 0$ and choose N so large that

$$|S_N - L| < \epsilon \quad \text{and} \quad |\Pi_N - \Pi| < \epsilon.$$

We can next select M large enough so that

$$|S_N - \Pi_{N,M}| < \epsilon \quad \text{and} \quad |\Pi_{N,M} - \Pi_N| < \epsilon.$$

To see the first inequality, one uses the fundamental theorem of arithmetic
and the fact that the Dirichlet characters are multiplicative. The
second inequality follows merely because each series $\sum_{n=1}^{\infty} \frac{\chi(p^n)}{p^{ns}}$ converges.

Therefore

$$|L - \Pi| \le |L - S_N| + |S_N - \Pi_{N,M}| + |\Pi_{N,M} - \Pi_N| + |\Pi_N - \Pi| < 4\epsilon,$$

as was to be shown.
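As a numerical sanity check of the product formula, the Python sketch below compares $S_N$ and $\Pi_N$ for the character modulo 4 of the earlier example at $s = 2$. The function names are ours, and the computation only illustrates the statement.

```python
from math import isqrt


def chi4(n: int) -> int:
    """The Dirichlet character modulo 4 from the 4k + 1 example."""
    r = n % 4
    return 1 if r == 1 else (-1 if r == 3 else 0)


def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes p <= n."""
    is_prime = [True] * (n + 1)
    is_prime[:2] = [False] * min(2, n + 1)
    for i in range(2, isqrt(n) + 1):
        if is_prime[i]:
            for j in range(i * i, n + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def S(N: int, s: float) -> float:
    """Partial sum S_N of chi4(n) / n^s over n <= N."""
    return sum(chi4(n) / n**s for n in range(1, N + 1))


def Pi(N: int, s: float) -> float:
    """Partial product Pi_N of 1 / (1 - chi4(p) p^(-s)) over primes p <= N."""
    product = 1.0
    for p in primes_up_to(N):
        product *= 1.0 / (1.0 - chi4(p) * p ** (-s))
    return product


print(S(5000, 2.0), Pi(5000, 2.0))
```

The prime 2 contributes the factor 1 to the product, since the character vanishes on even integers.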

3.2 L-functions 

The next step is a better understanding of the L-functions. Their behavior
as functions of s (especially near $s = 1$) depends on whether or not $\chi$
is trivial. In the first case, $L(s, \chi_0)$ is up to some simple factors just the
zeta function.

Proposition 3.3 Suppose $\chi_0$ is the trivial Dirichlet character,

$$\chi_0(n) = \begin{cases} 1 & \text{if } n \text{ and } q \text{ are relatively prime,} \\ 0 & \text{otherwise,} \end{cases}$$

and $q = p_1^{a_1} \cdots p_N^{a_N}$ is the prime factorization of q. Then

$$L(s, \chi_0) = (1 - p_1^{-s})(1 - p_2^{-s}) \cdots (1 - p_N^{-s})\, \zeta(s).$$

Therefore $L(s, \chi_0) \to \infty$ as $s \to 1^+$.

Proof. The identity follows at once on comparing the Dirichlet and
Euler product formulas. The final statement holds because $\zeta(s) \to \infty$ as
$s \to 1^+$.

The behavior of the remaining L-functions, those for which $\chi \ne \chi_0$,
is more subtle. A remarkable property is that these functions are now
defined and continuous for $s > 0$. In fact, more is true.

Proposition 3.4 If $\chi$ is a non-trivial Dirichlet character, then the series

$$\sum_{n=1}^{\infty} \chi(n)/n^s$$

converges for $s > 0$, and we denote its sum by $L(s, \chi)$. Moreover:

(i) The function $L(s, \chi)$ is continuously differentiable for $0 < s < \infty$.

(ii) There exist constants $c, c' > 0$ so that

$$L(s, \chi) = 1 + O(e^{-cs}) \quad \text{as } s \to \infty, \text{ and}$$

$$L'(s, \chi) = O(e^{-c's}) \quad \text{as } s \to \infty.$$

We first isolate the key cancellation property that non-trivial Dirichlet
characters possess, which accounts for the behavior of the L-function
described in the proposition.


Lemma 3.5 If $\chi$ is a non-trivial Dirichlet character, then

$$\left| \sum_{n=1}^{k} \chi(n) \right| \le q, \quad \text{for any } k.$$

Proof. First, we recall that

$$\sum_{n=1}^{q} \chi(n) = 0.$$

In fact, if S denotes the sum and $a \in \mathbb{Z}^*(q)$, then the multiplicative
property of the Dirichlet character $\chi$ gives

$$\chi(a) S = \sum \chi(a)\chi(n) = \sum \chi(an) = \sum \chi(n) = S.$$

Since $\chi$ is non-trivial, $\chi(a) \ne 1$ for some a, hence $S = 0$. We now write
$k = aq + b$ with $0 \le b < q$, and note that

$$\sum_{n=1}^{k} \chi(n) = \sum_{n=1}^{aq} \chi(n) + \sum_{aq < n \le aq + b} \chi(n) = \sum_{aq < n \le aq + b} \chi(n),$$

and there are no more than q terms in the last sum. The proof is complete
once we recall that $|\chi(n)| \le 1$.
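The cancellation in Lemma 3.5 can be observed on a concrete non-trivial character. The Python sketch below uses, as an assumption of ours and not an example from the text, the quadratic (Legendre) character modulo an odd prime q, computed by Euler's criterion. It checks that the sum over one period vanishes and records the largest partial sum.

```python
def legendre_character(n: int, q: int) -> int:
    """Quadratic character modulo an odd prime q:
    0 if q divides n, 1 if n is a non-zero square mod q, -1 otherwise."""
    n %= q
    if n == 0:
        return 0
    # Euler's criterion: n^((q-1)/2) is 1 mod q exactly when n is a square.
    return 1 if pow(n, (q - 1) // 2, q) == 1 else -1


def max_partial_sum(q: int, K: int) -> int:
    """Largest |chi(1) + ... + chi(k)| over 1 <= k <= K."""
    s, best = 0, 0
    for n in range(1, K + 1):
        s += legendre_character(n, q)
        best = max(best, abs(s))
    return best


q = 11
print(sum(legendre_character(n, q) for n in range(1, q + 1)))
print(max_partial_sum(q, 500))
```

The partial sums are periodic with period q, so their maximum over any range is attained within the first period and cannot exceed q.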

We can now prove the proposition. Let $s_k = \sum_{n=1}^{k} \chi(n)$, and $s_0 = 0$.
We know that $L(s, \chi)$ is defined for $s > 1$ by the series

$$\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s},$$

which converges absolutely and uniformly for $s > \delta > 1$. Moreover, the
differentiated series also converges absolutely and uniformly for $s > \delta > 1$,
which shows that $L(s, \chi)$ is continuously differentiable for $s > 1$. We
sum by parts$^2$ to extend this result to $s > 0$. Indeed, we have

$$\sum_{k=1}^{N} \frac{\chi(k)}{k^s} = \sum_{k=1}^{N} \frac{s_k - s_{k-1}}{k^s}
= \sum_{k=1}^{N-1} s_k \Bigl[ \frac{1}{k^s} - \frac{1}{(k+1)^s} \Bigr] + \frac{s_N}{N^s}
= \sum_{k=1}^{N-1} f_k(s) + \frac{s_N}{N^s},$$

where $f_k(s) = s_k [k^{-s} - (k+1)^{-s}]$. If $g(x) = x^{-s}$, then $g'(x) = -s x^{-s-1}$,
so applying the mean-value theorem between $x = k$ and $x = k + 1$, and
the fact that $|s_k| \le q$, we find that

$$|f_k(s)| \le q s k^{-s-1}.$$

Therefore, the series $\sum f_k(s)$ converges absolutely and uniformly for
$s > \delta > 0$, and this proves that $L(s, \chi)$ is continuous for $s > 0$. To prove that
it is also continuously differentiable, we differentiate the series term by
term, obtaining

$$-\sum_{n=1}^{\infty} (\log n) \frac{\chi(n)}{n^s}.$$

Again, we rewrite this series using summation by parts as

$$\sum s_k [-k^{-s} \log k + (k+1)^{-s} \log(k+1)],$$

and an application of the mean-value theorem to the function $g(x) = x^{-s} \log x$
shows that the terms are $O(k^{-\delta/2 - 1})$, thus proving that the
differentiated series converges uniformly for $s > \delta > 0$. Hence $L(s, \chi)$ is
continuously differentiable for $s > 0$.
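
To make the summation-by-parts step concrete, here is a small numerical sketch. It assumes the non-trivial character modulo 4 as an example (so that $L(1, \chi) = 1 - 1/3 + 1/5 - \cdots = \pi/4$, the Leibniz series); the helper names are hypothetical.

```python
import math


def chi4(n: int) -> int:
    """Non-trivial character modulo 4 (illustrative choice)."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def direct_sum(s: float, N: int) -> float:
    """Partial sum of chi(k) / k^s for 1 <= k <= N."""
    return sum(chi4(k) / k ** s for k in range(1, N + 1))


def by_parts_sum(s: float, N: int) -> float:
    """Same partial sum, rewritten as sum_{k<N} s_k [k^-s - (k+1)^-s] + s_N / N^s."""
    s_k = 0        # running partial sum s_k = chi(1) + ... + chi(k)
    total = 0.0
    for k in range(1, N):
        s_k += chi4(k)
        total += s_k * (k ** (-s) - (k + 1) ** (-s))
    s_N = s_k + chi4(N)
    return total + s_N / N ** s


# The two expressions agree, even for 0 < s <= 1 where absolute convergence fails.
print(direct_sum(0.5, 2000), by_parts_sum(0.5, 2000))
# At s = 1 the partial sums approach pi / 4.
print(direct_sum(1.0, 100_000), math.pi / 4)
```

The rearranged form is the one whose terms are controlled by $|s_k| \le q$, which is why it converges for every $s > 0$.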

Now, observe that for all $s$ large,

$$|L(s, \chi) - 1| \le 2q \sum_{n=2}^{\infty} n^{-s} \le 2^{-s} O(1),$$

and we can take $c = \log 2$, to see that $L(s, \chi) = 1 + O(e^{-cs})$ as $s \to \infty$.
A similar argument also shows that $L'(s, \chi) = O(e^{-c's})$ as $s \to \infty$ with
in fact $c' = c$, and the proof of the proposition is complete.


2 For the formula of summation by parts, see Exercise 7 in Chapter 2.

With the facts gathered so far about $L(s, \chi)$ we are in a position to
define the logarithm of the L-functions. This is done by integrating its
logarithmic derivative. In other words, if $\chi$ is a non-trivial Dirichlet
character and $s > 1$ we define$^3$

$$\log_2 L(s, \chi) = -\int_s^{\infty} \frac{L'(t, \chi)}{L(t, \chi)} \, dt.$$

We know that $L(t, \chi) \ne 0$ for every $t > 1$ since it is given by a product
(Proposition 3.2), and the integral is convergent because

$$\frac{L'(t, \chi)}{L(t, \chi)} = O(e^{-ct}),$$

which follows from the behavior at infinity of $L(t, \chi)$ and $L'(t, \chi)$ recorded
earlier.

The following links the two logarithms. 


Proposition 3.6 If $s > 1$, then

$$e^{\log_2 L(s, \chi)} = L(s, \chi).$$

Moreover

$$\log_2 L(s, \chi) = \sum_p \log_1 \Bigl( \frac{1}{1 - \chi(p)/p^s} \Bigr).$$


Proof. Differentiating $e^{-\log_2 L(s, \chi)} L(s, \chi)$ with respect to $s$ gives

$$-\frac{L'(s, \chi)}{L(s, \chi)} e^{-\log_2 L(s, \chi)} L(s, \chi) + e^{-\log_2 L(s, \chi)} L'(s, \chi) = 0.$$

So $e^{-\log_2 L(s, \chi)} L(s, \chi)$ is constant, and this constant can be seen to be 1
by letting $s$ tend to infinity. This proves the first conclusion.

To prove the equality between the logarithms, we fix $s$ and take the
exponential of both sides. The left-hand side becomes $e^{\log_2 L(s, \chi)} = L(s, \chi)$,
and the right-hand side becomes

$$e^{\sum_p \log_1 \left( \frac{1}{1 - \chi(p)/p^s} \right)}
= \prod_p e^{\log_1 \left( \frac{1}{1 - \chi(p)/p^s} \right)}
= \prod_p \frac{1}{1 - \chi(p)/p^s} = L(s, \chi),$$

by (i) in Proposition 3.1 and the Dirichlet product formula. Therefore,
for each $s$ there exists an integer $M(s)$ so that

$$\log_2 L(s, \chi) - \sum_p \log_1 \Bigl( \frac{1}{1 - \chi(p)/p^s} \Bigr) = 2\pi i M(s).$$

As the reader may verify, the left-hand side is continuous in $s$, and this
implies the continuity of the function $M(s)$. But $M(s)$ is integer-valued
so we conclude that $M(s)$ is constant, and this constant can be seen to
be 0 by letting $s$ go to infinity.

3 The notation $\log_2$ used in this context should not be confused with the
logarithm to the base 2.

Putting together the work we have done so far gives rigorous meaning
to the formal argument presented earlier. Indeed, the properties of $\log_1$
show that

$$\sum_p \log_1 \Bigl( \frac{1}{1 - \chi(p)/p^s} \Bigr) = \sum_p \frac{\chi(p)}{p^s} + O(1).$$

Now if $L(1, \chi) \ne 0$ for a non-trivial Dirichlet character, then by its
integral representation $\log_2 L(s, \chi)$ remains bounded as $s \to 1^+$. Thus
the identity between the logarithms implies that $\sum_p \chi(p) p^{-s}$ remains
bounded as $s \to 1^+$, which is the desired result. Therefore, to finish the
proof of Dirichlet's theorem, we need to see that $L(1, \chi) \ne 0$ when $\chi$ is
non-trivial.

3.3 Non-vanishing of the L-function 

We now turn to a proof of the following deep result: 

Theorem 3.7 If $\chi \ne \chi_0$, then $L(1, \chi) \ne 0$.

There are several proofs of this fact, some involving algebraic number 
theory (among them Dirichlet’s original argument), and others involving 
complex analysis. Here we opt for a more elementary argument that 
requires no special knowledge of either of these areas. The proof splits 
in two cases, depending on whether $\chi$ is complex or real. A Dirichlet
character is said to be real if it takes on only real values (that is,
$+1$, $-1$, or $0$) and complex otherwise. In other words, $\chi$ is real if and
only if $\chi(n) = \overline{\chi(n)}$ for all integers $n$.

Case I: complex Dirichlet characters

This is the easier of the two cases. The proof is by contradiction, and we
use two lemmas.


Lemma 3.8 If $s > 1$, then

$$\prod_{\chi} L(s, \chi) \ge 1,$$

where the product is taken over all Dirichlet characters. In particular the
product is real-valued.

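Proof. We have shown earlier that for $s > 1$

$$L(s, \chi) = \exp \Bigl( \sum_p \log_1 \Bigl( \frac{1}{1 - \chi(p) p^{-s}} \Bigr) \Bigr).$$

Hence,

$$\prod_{\chi} L(s, \chi)
= \exp \Bigl( \sum_{\chi} \sum_p \log_1 \Bigl( \frac{1}{1 - \chi(p) p^{-s}} \Bigr) \Bigr)
= \exp \Bigl( \sum_{\chi} \sum_p \sum_{k=1}^{\infty} \frac{1}{k} \frac{\chi(p^k)}{p^{ks}} \Bigr)
= \exp \Bigl( \sum_p \sum_{k=1}^{\infty} \sum_{\chi} \frac{1}{k} \frac{\chi(p^k)}{p^{ks}} \Bigr).$$

Because of Lemma 2.2 (with $\ell = 1$) we have $\sum_{\chi} \chi(p^k) = \varphi(q) \delta_1(p^k)$, and
hence

$$\prod_{\chi} L(s, \chi) = \exp \Bigl( \varphi(q) \sum_p \sum_{k=1}^{\infty} \frac{1}{k} \frac{\delta_1(p^k)}{p^{ks}} \Bigr) \ge 1,$$

since the term in the exponential is non-negative.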
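
As a sanity check of the lemma, one can compute the product numerically for a small modulus. The sketch below assumes $q = 4$, which has exactly two characters (the trivial one and the non-trivial one), and replaces each $L(s, \chi)$ by a truncated series; the function names and the truncation length are our own choices.

```python
def chi0(n: int) -> int:
    """Trivial character modulo 4: 1 on odd n, 0 on even n."""
    return 1 if n % 2 == 1 else 0


def chi1(n: int) -> int:
    """Non-trivial character modulo 4."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def L_partial(chi, s: float, terms: int = 20_000) -> float:
    """Truncated series for L(s, chi); only an approximation of the true value."""
    return sum(chi(n) / n ** s for n in range(1, terms + 1))


def product_over_characters(s: float) -> float:
    """Approximate product of L(s, chi) over both characters modulo 4."""
    return L_partial(chi0, s) * L_partial(chi1, s)


for s in (1.5, 2.0, 3.0):
    print(s, product_over_characters(s))
```

For each sampled $s > 1$ the approximate product stays above 1, although the non-trivial factor alone is smaller than 1.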


Lemma 3.9 The following three properties hold:

(i) If $L(1, \chi) = 0$, then $L(1, \overline{\chi}) = 0$.

(ii) If $\chi$ is non-trivial and $L(1, \chi) = 0$, then

$$|L(s, \chi)| \le C |s - 1| \quad \text{when } 1 \le s \le 2.$$
















(iii) For the trivial Dirichlet character $\chi_0$, we have

$$|L(s, \chi_0)| \le \frac{C}{|s - 1|} \quad \text{when } 1 < s \le 2.$$


Proof. The first statement is immediate because $L(1, \overline{\chi}) = \overline{L(1, \chi)}$.
The second statement follows from the mean-value theorem since $L(s, \chi)$
is continuously differentiable for $s > 0$ when $\chi$ is non-trivial. Finally, the
last statement follows because by Proposition 3.3

$$L(s, \chi_0) = (1 - p_1^{-s})(1 - p_2^{-s}) \cdots (1 - p_N^{-s}) \zeta(s),$$

and $\zeta$ satisfies the similar estimate (3).

We can now conclude the proof that $L(1, \chi) \ne 0$ for $\chi$ a non-trivial
complex Dirichlet character. If not, say $L(1, \chi) = 0$, then we also have
$L(1, \overline{\chi}) = 0$. Since $\chi \ne \overline{\chi}$, there are at least two terms in the product

$$\prod_{\chi} L(s, \chi)$$

that vanish like $|s - 1|$ as $s \to 1^+$. Since only the trivial character
contributes a term that grows, and this growth is no worse than $O(1/|s - 1|)$,
we find that the product goes to 0 as $s \to 1^+$, contradicting the fact that
it is $\ge 1$ by Lemma 3.8.


Case II: real Dirichlet characters 

The proof that $L(1, \chi) \ne 0$ when $\chi$ is a non-trivial real Dirichlet character
is very different from the earlier complex case. The method we shall
exploit involves summation along hyperbolas. It is a curious fact that
this method was introduced by Dirichlet himself, twelve years after the
proof of his theorem on arithmetic progressions, to establish another
famous result of his: the average order of the divisor function. However,
he made no connection between the proofs of these two theorems. We will
instead proceed by proving first Dirichlet's divisor theorem, as a simple
example of the method of summation along hyperbolas. Then, we shall
adapt these ideas to prove the fact that $L(1, \chi) \ne 0$. As a preliminary
matter, we need to deal with some simple sums, and their corresponding
integral analogues.

Sums vs. Integrals 

Here we use the idea of comparing a sum with its corresponding integral,
which already occurred in the estimate (3) for the zeta function.

Proposition 3.10 If $N$ is a positive integer, then:

(i) $\displaystyle \sum_{1 \le n \le N} \frac{1}{n} = \int_1^N \frac{dx}{x} + O(1) = \log N + O(1).$

(ii) More precisely, there exists a real number $\gamma$, called Euler's constant,
so that

$$\sum_{1 \le n \le N} \frac{1}{n} = \log N + \gamma + O(1/N).$$

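Proof. It suffices to establish the more refined estimate given in
part (ii). Let

$$\gamma_n = \frac{1}{n} - \int_n^{n+1} \frac{dx}{x}.$$

Since $1/x$ is decreasing, we clearly have

$$0 \le \gamma_n \le \frac{1}{n} - \frac{1}{n+1} \le \frac{1}{n^2},$$

so the series $\sum_{n=1}^{\infty} \gamma_n$ converges to a limit which we denote by $\gamma$.
Moreover, if we estimate $\sum f(n)$ by $\int f(x) \, dx$, where $f(x) = 1/x^2$, we find

$$\sum_{n=N+1}^{\infty} \gamma_n \le \sum_{n=N+1}^{\infty} \frac{1}{n^2} \le \int_N^{\infty} \frac{dx}{x^2} = O(1/N).$$

Therefore

$$\sum_{n=1}^{N} \frac{1}{n} - \int_1^N \frac{dx}{x} = \gamma - \sum_{n=N+1}^{\infty} \gamma_n + \int_N^{N+1} \frac{dx}{x},$$

and this last integral is $O(1/N)$ as $N \to \infty$.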
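
The estimate in part (ii) is easy to observe numerically. In the sketch below the decimal value of Euler's constant is hard-coded as an assumption of the example (it is not computed from the proof), and `harmonic` is a hypothetical helper name.

```python
import math

EULER_GAMMA = 0.5772156649015329  # known numerical value of Euler's constant


def harmonic(N: int) -> float:
    """Partial sum 1 + 1/2 + ... + 1/N."""
    return sum(1.0 / n for n in range(1, N + 1))


for N in (10, 100, 1000, 10_000):
    err = harmonic(N) - math.log(N) - EULER_GAMMA
    # err is the O(1/N) remainder; N * err stays bounded as N grows.
    print(N, err, N * err)
```

The printed products `N * err` remain bounded, in agreement with the $O(1/N)$ error term.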


Proposition 3.11 If $N$ is a positive integer, then

$$\sum_{1 \le n \le N} \frac{1}{n^{1/2}} = \int_1^N \frac{dx}{x^{1/2}} + c' + O(1/N^{1/2})
= 2N^{1/2} + c + O(1/N^{1/2}).$$

The proof is essentially a repetition of the proof of the previous
proposition, this time using the fact that

$$\Bigl| \frac{1}{n^{1/2}} - \frac{1}{(n+1)^{1/2}} \Bigr| \le \frac{C}{n^{3/2}}.$$

This last inequality follows from the mean-value theorem applied to
$f(x) = x^{-1/2}$, between $x = n$ and $x = n + 1$.

Hyperbolic sums 

If $F$ is a function defined on pairs of positive integers, there are three
ways to calculate

$$S_N = \sum \sum F(m, n),$$

where the sum is taken over all pairs of positive integers $(m, n)$ which
satisfy $mn \le N$.

We may carry out the summation in any one of the following three
ways. (See Figure 2.)

(a) Along hyperbolas:

$$S_N = \sum_{1 \le k \le N} \Bigl( \sum_{nm = k} F(m, n) \Bigr)$$

(b) Vertically:

$$S_N = \sum_{1 \le m \le N} \Bigl( \sum_{1 \le n \le N/m} F(m, n) \Bigr)$$

(c) Horizontally:

$$S_N = \sum_{1 \le n \le N} \Bigl( \sum_{1 \le m \le N/n} F(m, n) \Bigr)$$


It is a remarkable fact that one can obtain interesting conclusions from 
the obvious fact that these three methods of summation give the same 
sum. We apply this idea first in the study of the divisor problem. 
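
The three orders of summation can be written out directly. This is a minimal sketch; the function names and the test function `F` are illustrative choices, not notation from the text.

```python
def along_hyperbolas(F, N: int):
    """Sum over k = 1..N of the sum of F(m, n) over all factorizations mn = k."""
    return sum(F(m, k // m)
               for k in range(1, N + 1)
               for m in range(1, k + 1) if k % m == 0)


def vertically(F, N: int):
    """For each m, sum over the column 1 <= n <= N/m."""
    return sum(F(m, n) for m in range(1, N + 1) for n in range(1, N // m + 1))


def horizontally(F, N: int):
    """For each n, sum over the row 1 <= m <= N/n."""
    return sum(F(m, n) for n in range(1, N + 1) for m in range(1, N // n + 1))


F = lambda m, n: m + 2 * n * n  # an arbitrary, non-symmetric integer-valued F
print(along_hyperbolas(F, 50), vertically(F, 50), horizontally(F, 50))
```

All three functions visit exactly the lattice points with $mn \le N$, so they return the same value for any $F$.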

Intermezzo: the divisor problem 

For a positive integer $k$, let $d(k)$ denote the number of positive divisors
of $k$. For example,

k     |  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
d(k)  |  1  2  2  3  2  4  2  4  3  4  2  6  2  4  4  5  2

Figure 2. The three methods of summation


One observes that the behavior of $d(k)$ as $k$ tends to infinity is rather
irregular, and in fact, it does not seem possible to approximate $d(k)$ by
a simple analytic expression in $k$. However, it is natural to inquire about
the average size of $d(k)$. In other words, one might ask, what is the
behavior of

$$\frac{1}{N} \sum_{k=1}^{N} d(k) \quad \text{as } N \to \infty?$$

The answer was provided by Dirichlet, who made use of hyperbolic sums.
Indeed, we observe that

$$d(k) = \sum_{nm = k,\ 1 \le n, m} 1.$$

Theorem 3.12 If $N$ is a positive integer, then

$$\frac{1}{N} \sum_{k=1}^{N} d(k) = \log N + O(1).$$

More precisely,

$$\frac{1}{N} \sum_{k=1}^{N} d(k) = \log N + (2\gamma - 1) + O(1/N^{1/2}),$$

where $\gamma$ is Euler's constant.

Proof. Let $S_N = \sum_{k=1}^{N} d(k)$. We observed that summing $F = 1$ along
hyperbolas gives $S_N$. Summing vertically, we find

$$S_N = \sum_{1 \le m \le N} \sum_{1 \le n \le N/m} 1.$$

But $\sum_{1 \le n \le N/m} 1 = [N/m] = N/m + O(1)$, where $[x]$ denotes the greatest
integer $\le x$. Therefore

$$S_N = \sum_{1 \le m \le N} (N/m + O(1)) = N \Bigl( \sum_{1 \le m \le N} 1/m \Bigr) + O(N).$$

Hence, by part (i) of Proposition 3.10,

$$\frac{S_N}{N} = \log N + O(1),$$

which gives the first conclusion.

For the more refined estimate we proceed as follows. Consider the
three regions $I$, $II$, and $III$ shown in Figure 3. These are defined by

$$I = \{1 \le m < N^{1/2},\ N^{1/2} < n \le N/m\},$$
$$II = \{1 \le m \le N^{1/2},\ 1 \le n \le N^{1/2}\},$$
$$III = \{N^{1/2} < m \le N/n,\ 1 \le n < N^{1/2}\}.$$

If $S_I$, $S_{II}$, and $S_{III}$ denote the sums taken over the regions $I$, $II$, and
$III$, respectively, then

$$S_N = S_I + S_{II} + S_{III} = 2(S_I + S_{II}) - S_{II},$$

since by symmetry $S_I = S_{III}$. Now we sum vertically, and use (ii) of
Proposition 3.10 to obtain

$$S_I + S_{II} = \sum_{1 \le m \le N^{1/2}} \Bigl( \sum_{1 \le n \le N/m} 1 \Bigr)
= \sum_{1 \le m \le N^{1/2}} [N/m]
= \sum_{1 \le m \le N^{1/2}} (N/m + O(1))$$
$$= N \Bigl( \sum_{1 \le m \le N^{1/2}} 1/m \Bigr) + O(N^{1/2})
= N \log N^{1/2} + N\gamma + O(N^{1/2}).$$

Finally, $S_{II}$ corresponds to a square so

$$S_{II} = \sum_{1 \le m \le N^{1/2}} \sum_{1 \le n \le N^{1/2}} 1 = [N^{1/2}]^2 = N + O(N^{1/2}).$$

Putting these estimates together and dividing by $N$ yields the more
refined statement in the theorem.
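
The refined asymptotic can be compared with exact values of the divisor sum. The sketch below uses the vertical summation $S_N = \sum_{m \le N} [N/m]$ from the proof; the hard-coded value of Euler's constant and the helper name `divisor_sum` are assumptions of this example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # known numerical value of Euler's constant


def divisor_sum(N: int) -> int:
    """S_N = d(1) + ... + d(N), computed vertically as the sum of floor(N / m)."""
    return sum(N // m for m in range(1, N + 1))


for N in (100, 1000, 10_000):
    average = divisor_sum(N) / N
    predicted = math.log(N) + 2 * EULER_GAMMA - 1
    # The gap should be of size O(1 / sqrt(N)).
    print(N, average, predicted, abs(average - predicted) * math.sqrt(N))
```

The last printed column, the gap rescaled by $N^{1/2}$, stays bounded for these values of $N$.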

Non-vanishing of the L-function

Our essential application of summation along hyperbolas is to the main
point of this section, namely that $L(1, \chi) \ne 0$ for a non-trivial real
Dirichlet character $\chi$.

Given such a character, let

$$F(m, n) = \frac{\chi(n)}{(nm)^{1/2}},$$

and define

$$S_N = \sum \sum F(m, n),$$

where the sum is over all integers $m, n \ge 1$ that satisfy $mn \le N$.

Proposition 3.13 The following statements are true:

(i) $S_N \ge c \log N$ for some constant $c > 0$.

(ii) $S_N = 2N^{1/2} L(1, \chi) + O(1)$.

It suffices to prove the proposition, since the assumption $L(1, \chi) = 0$
would give an immediate contradiction.

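We first sum along hyperbolas. Observe that

$$\sum_{nm = k} \frac{\chi(n)}{(nm)^{1/2}} = \frac{1}{k^{1/2}} \sum_{n | k} \chi(n).$$

For conclusion (i) it will be enough to show the following lemma.


Lemma 3.14

$$\sum_{n | k} \chi(n) \ge
\begin{cases}
0 & \text{for all } k, \\
1 & \text{if } k = \ell^2 \text{ for some } \ell \in \mathbb{Z}.
\end{cases}$$

From the lemma, we then get

$$S_N \ge \sum_{k = \ell^2,\ \ell \le N^{1/2}} \frac{1}{k^{1/2}} \ge c \log N,$$

where the last inequality follows from (i) in Proposition 3.10.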

The proof of the lemma is simple. If $k$ is a power of a prime, say
$k = p^a$, then the divisors of $k$ are $1, p, p^2, \ldots, p^a$ and

$$\sum_{n | k} \chi(n) = \chi(1) + \chi(p) + \chi(p^2) + \cdots + \chi(p^a)
= 1 + \chi(p) + \chi(p)^2 + \cdots + \chi(p)^a.$$

So this sum is equal to

$$\begin{cases}
a + 1 & \text{if } \chi(p) = 1, \\
1 & \text{if } \chi(p) = -1 \text{ and } a \text{ is even}, \\
0 & \text{if } \chi(p) = -1 \text{ and } a \text{ is odd}, \\
1 & \text{if } \chi(p) = 0, \text{ that is } p | q.
\end{cases}$$

In general, if $k = p_1^{a_1} \cdots p_N^{a_N}$, then any divisor of $k$ is of the form
$p_1^{b_1} \cdots p_N^{b_N}$ where $0 \le b_j \le a_j$ for all $j$. Therefore, the multiplicative
property of $\chi$ gives

$$\sum_{n | k} \chi(n) = \prod_{j=1}^{N} \bigl( \chi(1) + \chi(p_j) + \chi(p_j^2) + \cdots + \chi(p_j^{a_j}) \bigr),$$

and the proof is complete.
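
The lemma can be tested by brute force for a specific real character. The sketch below assumes the non-trivial character modulo 3 as an example; `chi3` and `divisor_character_sum` are hypothetical names.

```python
import math


def chi3(n: int) -> int:
    """Non-trivial real character modulo 3: values 1, -1, 0 for n = 1, 2, 0 mod 3."""
    return {1: 1, 2: -1}.get(n % 3, 0)


def divisor_character_sum(k: int) -> int:
    """Sum of chi(n) over all positive divisors n of k."""
    return sum(chi3(n) for n in range(1, k + 1) if k % n == 0)


for k in (1, 2, 4, 7, 9, 12, 28, 49):
    is_square = math.isqrt(k) ** 2 == k
    print(k, divisor_character_sum(k), "square" if is_square else "")
```

In every case the sum is non-negative, and it is at least 1 whenever $k$ is a perfect square.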

To prove the second statement in the proposition, we write

$$S_N = S_I + (S_{II} + S_{III}),$$

where the sums $S_I$, $S_{II}$, and $S_{III}$ were defined earlier (see also Figure 3).
We evaluate $S_I$ by summing vertically, and $S_{II} + S_{III}$ by summing
horizontally. In order to carry this out we need the following simple results.


Lemma 3.15 For all integers $0 < a < b$ we have

(i) $\displaystyle \sum_{n=a}^{b} \frac{\chi(n)}{n^{1/2}} = O(a^{-1/2}),$

(ii) $\displaystyle \sum_{n=a}^{b} \frac{\chi(n)}{n} = O(a^{-1}).$

Proof. This argument is similar to the proof of Proposition 3.4; we
use summation by parts. Let $s_n = \sum_{1 \le k \le n} \chi(k)$, and remember that
$|s_n| \le q$ for all $n$. Then

$$\sum_{n=a}^{b} \frac{\chi(n)}{n^{1/2}}
= \sum_{n=a}^{b-1} s_n \bigl[ n^{-1/2} - (n+1)^{-1/2} \bigr] + O(a^{-1/2})
= O\Bigl( \sum_{n=a}^{\infty} n^{-3/2} \Bigr) + O(a^{-1/2}).$$

By comparing the sum $\sum_{n=a}^{\infty} n^{-3/2}$ with the integral of $f(x) = x^{-3/2}$,
we find that the former is also $O(a^{-1/2})$.

A similar argument establishes (ii).
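
Part (i) can be observed numerically. The sketch below again assumes the non-trivial character modulo 4 as an example; for this character the nonzero terms alternate in sign and decrease in size, so the rescaled sums below are in fact bounded by 1.

```python
import math


def chi4(n: int) -> int:
    """Non-trivial character modulo 4 (illustrative choice)."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def block_sum(a: int, b: int) -> float:
    """Sum of chi(n) / sqrt(n) for a <= n <= b."""
    return sum(chi4(n) / math.sqrt(n) for n in range(a, b + 1))


for a in (1, 10, 100, 1000):
    # Rescale by sqrt(a): the lemma says this quantity is bounded uniformly in a and b.
    print(a, abs(block_sum(a, a + 5000)) * math.sqrt(a))
```

The rescaled values stay bounded as $a$ grows, which is the content of the $O(a^{-1/2})$ estimate.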

We may now finish the proof of the proposition. Summing vertically
we find

$$S_I = \sum_{m < N^{1/2}} \frac{1}{m^{1/2}} \Bigl( \sum_{N^{1/2} < n \le N/m} \chi(n)/n^{1/2} \Bigr).$$

The lemma together with Proposition 3.11 shows that $S_I = O(1)$. Finally
we sum horizontally to get

$$S_{II} + S_{III} = \sum_{1 \le n \le N^{1/2}} \frac{\chi(n)}{n^{1/2}} \Bigl( \sum_{m \le N/n} 1/m^{1/2} \Bigr)$$
$$= \sum_{1 \le n \le N^{1/2}} \frac{\chi(n)}{n^{1/2}} \bigl\{ 2(N/n)^{1/2} + c + O((n/N)^{1/2}) \bigr\}$$
$$= 2N^{1/2} \sum_{1 \le n \le N^{1/2}} \frac{\chi(n)}{n}
+ c \sum_{1 \le n \le N^{1/2}} \frac{\chi(n)}{n^{1/2}}
+ O\Bigl( \frac{1}{N^{1/2}} \sum_{1 \le n \le N^{1/2}} 1 \Bigr)$$
$$= A + B + C.$$

Now observe that the lemma, together with the definition of $L(s, \chi)$,
implies

$$A = 2N^{1/2} L(1, \chi) + O(N^{1/2} N^{-1/2}).$$

Moreover, part (i) of the lemma gives $B = O(1)$, and we also clearly
have $C = O(1)$. Thus $S_N = 2N^{1/2} L(1, \chi) + O(1)$, which is part (ii) in
Proposition 3.13.
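
Both parts of Proposition 3.13 can be illustrated for a concrete character. The sketch below assumes the non-trivial character modulo 4, for which $L(1, \chi) = \pi/4$ by the Leibniz series; the function name `S` and the sample values of $N$ are our own choices.

```python
import math


def chi4(n: int) -> int:
    """Non-trivial real character modulo 4 (illustrative choice)."""
    return {1: 1, 3: -1}.get(n % 4, 0)


def S(N: int) -> float:
    """Sum of chi(n) / sqrt(nm) over all pairs m, n >= 1 with mn <= N (summed horizontally)."""
    total = 0.0
    for n in range(1, N + 1):
        c = chi4(n)
        if c == 0:
            continue
        for m in range(1, N // n + 1):
            total += c / math.sqrt(n * m)
    return total


L1 = math.pi / 4  # value of L(1, chi) for this particular character
for N in (100, 400, 1600):
    # First column grows like 2 sqrt(N) L(1, chi); the difference stays bounded.
    print(N, S(N), 2 * math.sqrt(N) * L1, S(N) - 2 * math.sqrt(N) * L1)
```

The difference in the last column remains bounded while $S_N$ itself grows, so $L(1, \chi)$ cannot vanish for this character.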


This completes the proof that $L(1, \chi) \ne 0$, and thus the proof of
Dirichlet's theorem.

4 Exercises 


1. Prove that there are infinitely many primes by observing that if there were
only finitely many, $p_1, \ldots, p_N$, then

$$\prod_{j=1}^{N} \frac{1}{1 - 1/p_j} \ge \sum_{n=1}^{\infty} \frac{1}{n}.$$

2. In the text we showed that there are infinitely many primes of the form $4k + 3$
by a modification of Euclid's original argument. Adapt this technique to prove
the similar result for primes of the form $3k + 2$, and for those of the form $6k + 5$.

3. Prove that if $p$ and $q$ are relatively prime, then $\mathbb{Z}^*(p) \times \mathbb{Z}^*(q)$ is isomorphic
to $\mathbb{Z}^*(pq)$.

4. Let $\varphi(n)$ denote the number of positive integers $\le n$ that are relatively prime
to $n$. Use the previous exercise to show that if $n$ and $m$ are relatively prime,
then

$$\varphi(nm) = \varphi(n)\varphi(m).$$

One can give a formula for the Euler phi-function as follows:

(a) Calculate $\varphi(p)$ when $p$ is prime by counting the number of elements in
$\mathbb{Z}^*(p)$.

(b) Give a formula for $\varphi(p^k)$ when $p$ is prime and $k \ge 1$ by counting the
number of elements in $\mathbb{Z}^*(p^k)$.

(c) Show that

$$\varphi(n) = n \prod_i \Bigl( 1 - \frac{1}{p_i} \Bigr),$$

where $p_i$ are the primes that divide $n$.


5. If $n$ is a positive integer, show that

$$n = \sum_{d | n} \varphi(d),$$

where $\varphi$ is the Euler phi-function.

[Hint: There are precisely $\varphi(n/d)$ integers $1 \le m \le n$ with $\gcd(m, n) = d$.]

6. Write down the characters of the groups Z*(3), Z*(4), Z*(5), Z*(6), and 
Z*(8). 

(a) Which ones are real, or complex? 

(b) Which ones are even, or odd? (A character is even if $\chi(-1) = 1$, and odd
otherwise).

7. Recall that for $|z| < 1$,

$$\log_1 \Bigl( \frac{1}{1 - z} \Bigr) = \sum_{k \ge 1} \frac{z^k}{k}.$$

We have seen that

$$e^{\log_1 \left( \frac{1}{1 - z} \right)} = \frac{1}{1 - z}.$$

(a) Show that if $w = 1/(1 - z)$, then $|z| < 1$ if and only if $\mathrm{Re}(w) > 1/2$.

(b) Show that if $\mathrm{Re}(w) > 1/2$ and $w = \rho e^{i\varphi}$ with $\rho > 0$, $|\varphi| < \pi$, then

$$\log_1 w = \log \rho + i\varphi.$$

[Hint: If $e^{\zeta} = w$, then the real part of $\zeta$ is uniquely determined and its
imaginary part is determined modulo $2\pi$.]


8. Let $\zeta$ denote the zeta function defined for $s > 1$.

(a) Compare $\zeta(s)$ with $\int_1^{\infty} x^{-s} \, dx$ to show that

$$\zeta(s) = \frac{1}{s - 1} + O(1) \quad \text{as } s \to 1^+.$$

(b) Prove as a consequence that

$$\sum_p \frac{1}{p^s} = \log \Bigl( \frac{1}{s - 1} \Bigr) + O(1) \quad \text{as } s \to 1^+.$$

9. Let $\chi_0$ denote the trivial Dirichlet character mod $q$, and $p_1, \ldots, p_k$ the distinct
prime divisors of $q$. Recall that $L(s, \chi_0) = (1 - p_1^{-s}) \cdots (1 - p_k^{-s}) \zeta(s)$, and show
as a consequence

$$L(s, \chi_0) = \frac{\varphi(q)}{q} \frac{1}{s - 1} + O(1) \quad \text{as } s \to 1^+.$$

[Hint: Use the asymptotics for $\zeta$ in Exercise 8.]

10. Show that if $\ell$ is relatively prime to $q$, then

$$\sum_{p \equiv \ell} \frac{1}{p^s} = \frac{1}{\varphi(q)} \log \Bigl( \frac{1}{s - 1} \Bigr) + O(1) \quad \text{as } s \to 1^+.$$

This is a quantitative version of Dirichlet's theorem.

[Hint: Recall (4).]

11. Use the characters for $\mathbb{Z}^*(3)$, $\mathbb{Z}^*(4)$, $\mathbb{Z}^*(5)$, and $\mathbb{Z}^*(6)$ to verify directly that
$L(1, \chi) \ne 0$ for all non-trivial Dirichlet characters modulo $q$ when $q = 3, 4, 5$,
and 6.

[Hint: Consider in each case the appropriate alternating series.]

12. Suppose $\chi$ is real and non-trivial; assuming the theorem that $L(1, \chi) \ne 0$,
show directly that $L(1, \chi) > 0$.

[Hint: Use the product formula for $L(s, \chi)$.]

13. Let $\{a_n\}_{n=-\infty}^{\infty}$ be a sequence of complex numbers such that $a_n = a_m$ if
$n \equiv m \bmod q$. Show that the series

$$\sum_{n=1}^{\infty} \frac{a_n}{n}$$

converges if and only if $\sum_{n=1}^{q} a_n = 0$.

[Hint: Sum by parts.]


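14. The series

$$F(\theta) = \sum_{|n| \ne 0} \frac{e^{in\theta}}{n}, \quad \text{for } |\theta| < \pi,$$

converges for every $\theta$ and is the Fourier series of the function defined on $[-\pi, \pi]$
by $F(0) = 0$ and

$$F(\theta) =
\begin{cases}
i(-\pi - \theta) & \text{if } -\pi \le \theta < 0 \\
i(\pi - \theta) & \text{if } 0 < \theta \le \pi,
\end{cases}$$

and extended by periodicity (period $2\pi$) to all of $\mathbb{R}$ (see Exercise 8 in Chapter 2).
Show also that if $\theta \not\equiv 0 \bmod 2\pi$, then the series

$$E(\theta) = \sum_{n=1}^{\infty} \frac{e^{in\theta}}{n}$$

converges, and that

$$E(\theta) = \frac{1}{2} \log \Bigl( \frac{1}{2 - 2\cos\theta} \Bigr) + \frac{1}{2} F(\theta).$$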

15. To sum the series $\sum_{n=1}^{\infty} a_n/n$, with $a_n = a_m$ if $n \equiv m \bmod q$ and
$\sum_{n=1}^{q} a_n = 0$, proceed as follows.

(a) Define

$$A(m) = \sum_{n=1}^{q} a_n \zeta^{-mn} \quad \text{where } \zeta = e^{2\pi i/q}.$$

Note that $A(q) = 0$. With the notation of the previous exercise, prove
that

$$\sum_{n=1}^{\infty} \frac{a_n}{n} = \frac{1}{q} \sum_{m=1}^{q-1} A(m) E(2\pi m/q).$$

[Hint: Use Fourier inversion on $\mathbb{Z}(q)$.]

(b) If $\{a_m\}$ is odd, ($a_{-m} = -a_m$) for $m \in \mathbb{Z}$, observe that $a_0 = a_q = 0$ and
show that

$$A(m) = \sum_{1 \le n < q/2} a_n (\zeta^{-mn} - \zeta^{mn}).$$

(c) Still assuming that $\{a_m\}$ is odd, show that

$$\sum_{n=1}^{\infty} \frac{a_n}{n} = \frac{1}{2q} \sum_{m=1}^{q-1} A(m) F(2\pi m/q).$$

[Hint: Define $\tilde{A}(m) = \sum_{n=1}^{q} a_n \zeta^{mn}$ and apply the Fourier inversion
formula.]


16. Use the previous exercises to show that

$$\frac{\pi}{3\sqrt{3}} = 1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{5} + \frac{1}{7} - \frac{1}{8} + \cdots,$$

which is $L(1, \chi)$ for the non-trivial (odd) Dirichlet character modulo 3.


5 Problems 

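1.* Here are other series that can be summed by the methods in Exercise 15.

(a) For the non-trivial Dirichlet character modulo 6, $L(1, \chi)$ equals

$$\frac{\pi}{2\sqrt{3}} = 1 - \frac{1}{5} + \frac{1}{7} - \frac{1}{11} + \frac{1}{13} - \cdots.$$

(b) If $\chi$ is the odd Dirichlet character modulo 8, then $L(1, \chi)$ equals

$$\frac{\pi}{2\sqrt{2}} = 1 + \frac{1}{3} - \frac{1}{5} - \frac{1}{7} + \frac{1}{9} + \frac{1}{11} - \cdots.$$

(c) For an odd Dirichlet character modulo 7, $L(1, \chi)$ equals

$$\frac{\pi}{\sqrt{7}} = 1 + \frac{1}{2} - \frac{1}{3} + \frac{1}{4} - \frac{1}{5} - \frac{1}{6} + \cdots.$$

(d) For an even Dirichlet character modulo 8, $L(1, \chi)$ equals

$$\frac{\log(1 + \sqrt{2})}{\sqrt{2}} = 1 - \frac{1}{3} - \frac{1}{5} + \frac{1}{7} + \frac{1}{9} - \frac{1}{11} - \cdots.$$

(e) For an even Dirichlet character modulo 5, $L(1, \chi)$ equals

$$\frac{2}{\sqrt{5}} \log \Bigl( \frac{1 + \sqrt{5}}{2} \Bigr) = 1 - \frac{1}{2} - \frac{1}{3} + \frac{1}{4} + \frac{1}{6} - \frac{1}{7} - \frac{1}{8} + \frac{1}{9} + \cdots.$$
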
2. Let $d(k)$ denote the number of positive divisors of $k$.

(a) Show that if $k = p_1^{a_1} \cdots p_n^{a_n}$ is the prime factorization of $k$, then

$$d(k) = (a_1 + 1) \cdots (a_n + 1).$$

Although Theorem 3.12 shows that on "average" $d(k)$ is of the order of $\log k$,
prove the following on the basis of (a):

(b) $d(k) = 2$ for infinitely many $k$.

(c) For any positive integer $N$, there is a constant $c > 0$ so that
$d(k) \ge c(\log k)^N$ for infinitely many $k$. [Hint: Let $p_1, \ldots, p_N$ be $N$ distinct primes,
and consider $k$ of the form $(p_1 p_2 \cdots p_N)^m$ for $m = 1, 2, \ldots$]


3. Show that if $p$ is relatively prime to $q$, then

$$\prod_{\chi} \Bigl( 1 - \frac{\chi(p)}{p^s} \Bigr) = \Bigl( 1 - \frac{1}{p^{fs}} \Bigr)^g,$$

where $g = \varphi(q)/f$, and $f$ is the order of $p$ in $\mathbb{Z}^*(q)$ (that is, the smallest $n$ for
which $p^n \equiv 1 \bmod q$). Here the product is taken over all Dirichlet characters
modulo $q$.

4. Prove as a consequence of the previous problem that

$$\prod_{\chi} L(s, \chi) = \sum_{n \ge 1} \frac{a_n}{n^s},$$

where $a_n \ge 0$, and the product is over all Dirichlet characters modulo $q$.

This appendix is meant as a quick review of the definition and main
properties of the Riemann integral on $\mathbb{R}$, and integration of appropriate
continuous functions on $\mathbb{R}^d$. Our exposition is brief since we assume that
the reader already has some familiarity with this material.

We begin with the theory of Riemann integration on a closed and
bounded interval on the real line. Besides the standard results about
the integral, we also discuss the notion of sets of measure 0, and give
a necessary and sufficient condition on the set of discontinuities of a
function that guarantees its integrability.

We also discuss multiple and repeated integrals. In particular, we
extend the notion of integration to the entire space $\mathbb{R}^d$ by restricting
ourselves to functions that decay fast enough at infinity.

1 Definition of the Riemann integral 

Let $f$ be a bounded real-valued function defined on the closed interval
$[a, b] \subset \mathbb{R}$. By a partition $P$ of $[a, b]$ we mean a finite sequence of
numbers $x_0, x_1, \ldots, x_N$ with

$$a = x_0 < x_1 < \cdots < x_{N-1} < x_N = b.$$

Given such a partition, we let $I_j$ denote the interval $[x_{j-1}, x_j]$ and write
$|I_j|$ for its length, namely $|I_j| = x_j - x_{j-1}$. We define the upper and
lower sums of $f$ with respect to $P$ by

$$\mathcal{U}(P, f) = \sum_{j=1}^{N} \bigl[ \sup_{x \in I_j} f(x) \bigr] |I_j|
\quad \text{and} \quad
\mathcal{L}(P, f) = \sum_{j=1}^{N} \bigl[ \inf_{x \in I_j} f(x) \bigr] |I_j|.$$

Note that the infimum and supremum exist because by assumption, $f$
is bounded. Clearly $\mathcal{U}(P, f) \ge \mathcal{L}(P, f)$, and the function $f$ is said to be
Riemann integrable, or simply integrable, if for every $\epsilon > 0$ there
exists a partition $P$ such that

$$\mathcal{U}(P, f) - \mathcal{L}(P, f) < \epsilon.$$

To define the value of the integral of $f$, we need to make a simple yet
important observation. A partition $P'$ is said to be a refinement of the
partition $P$ if $P'$ is obtained from $P$ by adding points. Then, adding one
point at a time, it is easy to check that

$$\mathcal{U}(P', f) \le \mathcal{U}(P, f) \quad \text{and} \quad \mathcal{L}(P', f) \ge \mathcal{L}(P, f).$$

From this, we see that if $P_1$ and $P_2$ are two partitions of $[a, b]$,
then

$$\mathcal{U}(P_1, f) \ge \mathcal{L}(P_2, f),$$

since it is possible to take $P'$ as a common refinement of both $P_1$ and $P_2$
to obtain

$$\mathcal{U}(P_1, f) \ge \mathcal{U}(P', f) \ge \mathcal{L}(P', f) \ge \mathcal{L}(P_2, f).$$

Since $f$ is bounded we see that both

$$U = \inf_P \mathcal{U}(P, f) \quad \text{and} \quad L = \sup_P \mathcal{L}(P, f)$$

exist (where the infimum and supremum are taken over all partitions of
$[a, b]$), and also that $U \ge L$. Moreover, if $f$ is integrable we must have
$U = L$, and we define $\int_a^b f(x) \, dx$ to be this common value.

Finally, a bounded complex-valued function $f = u + iv$ is said to be
integrable if its real and imaginary parts $u$ and $v$ are integrable, and we
define

$$\int_a^b f(x) \, dx = \int_a^b u(x) \, dx + i \int_a^b v(x) \, dx.$$

For example, the constants are integrable functions and it is clear that
if $c \in \mathbb{C}$, then $\int_a^b c \, dx = c(b - a)$. Also, continuous functions are
integrable. This is because a continuous function on a closed and bounded
interval $[a, b]$ is uniformly continuous, that is, given $\epsilon > 0$ there exists $\delta$
such that if $|x - y| < \delta$ then $|f(x) - f(y)| < \epsilon$. So if we choose $n$ with
$(b - a)/n < \delta$, then the partition $P$ given by

$$a, \ a + \frac{b - a}{n}, \ \ldots, \ a + k \frac{b - a}{n}, \ \ldots, \ a + (n - 1) \frac{b - a}{n}, \ b$$

satisfies $\mathcal{U}(P, f) - \mathcal{L}(P, f) \le \epsilon(b - a)$.


1.1 Basic properties 

Proposition 1.1 If $f$ and $g$ are integrable on $[a, b]$, then:

(i) $f + g$ is integrable, and $\int_a^b f(x) + g(x) \, dx = \int_a^b f(x) \, dx + \int_a^b g(x) \, dx$.

(ii) If $c \in \mathbb{C}$, then $\int_a^b c f(x) \, dx = c \int_a^b f(x) \, dx$.

(iii) If $f$ and $g$ are real-valued and $f(x) \le g(x)$, then $\int_a^b f(x) \, dx
\le \int_a^b g(x) \, dx$.

(iv) If $c \in [a, b]$, then $\int_a^b f(x) \, dx = \int_a^c f(x) \, dx + \int_c^b f(x) \, dx$.

Proof. For property (i) we may assume that $f$ and $g$ are real-valued.

If $P$ is a partition of $[a, b]$, then

$$\mathcal{U}(P, f + g) \le \mathcal{U}(P, f) + \mathcal{U}(P, g) \quad \text{and} \quad \mathcal{L}(P, f + g) \ge \mathcal{L}(P, f) + \mathcal{L}(P, g).$$

Given $\epsilon > 0$, there exist partitions $P_1$ and $P_2$ such that $\mathcal{U}(P_1, f) - \mathcal{L}(P_1, f) < \epsilon$
and $\mathcal{U}(P_2, g) - \mathcal{L}(P_2, g) < \epsilon$, so that if $P_0$ is a common refinement of
$P_1$ and $P_2$, we get

$$\mathcal{U}(P_0, f + g) - \mathcal{L}(P_0, f + g) < 2\epsilon.$$

So $f + g$ is integrable, and if we let $I = \inf_P \mathcal{U}(P, f + g) = \sup_P \mathcal{L}(P, f + g)$,
then we see that

$$I \le \mathcal{U}(P_0, f + g) + 2\epsilon \le \mathcal{U}(P_0, f) + \mathcal{U}(P_0, g) + 2\epsilon
\le \int_a^b f(x) \, dx + \int_a^b g(x) \, dx + 4\epsilon.$$

Similarly $I \ge \int_a^b f(x) \, dx + \int_a^b g(x) \, dx - 4\epsilon$, which proves that
$\int_a^b f(x) + g(x) \, dx = \int_a^b f(x) \, dx + \int_a^b g(x) \, dx$. The second and third parts of the
proposition are just as easy to prove. For the last property, simply refine
partitions of $[a, b]$ by adding the point $c$.


Another important property we need to prove is that fg is integrable 
whenever / and g are integrable. 


Lemma 1.2 If $f$ is real-valued integrable on $[a, b]$ and $\varphi$ is a real-valued
continuous function on $\mathbb{R}$, then $\varphi \circ f$ is also integrable on $[a, b]$.

Proof. Let $\epsilon > 0$ and remember that $f$ is bounded, say $|f| \le M$. Since
$\varphi$ is uniformly continuous on $[-M, M]$ we may choose $\delta > 0$ so that if
$s, t \in [-M, M]$ and $|s - t| < \delta$, then $|\varphi(s) - \varphi(t)| < \epsilon$. Now choose a
partition $P = \{x_0, \ldots, x_N\}$ of $[a, b]$ with $\mathcal{U}(P, f) - \mathcal{L}(P, f) < \delta^2$. Let
$I_j = [x_{j-1}, x_j]$ and distinguish two classes: we write $j \in \Lambda$ if
$\sup_{x \in I_j} f(x) - \inf_{x \in I_j} f(x) < \delta$ so that by construction

$$\sup_{x \in I_j} \varphi \circ f(x) - \inf_{x \in I_j} \varphi \circ f(x) \le \epsilon.$$

Otherwise, we write $j \in \Lambda'$ and note that

$$\delta \sum_{j \in \Lambda'} |I_j| \le \sum_{j \in \Lambda'} \bigl[ \sup_{x \in I_j} f(x) - \inf_{x \in I_j} f(x) \bigr] |I_j| \le \delta^2,$$

so $\sum_{j \in \Lambda'} |I_j| \le \delta$. Therefore, separating the cases $j \in \Lambda$ and $j \in \Lambda'$ we
find that

$$\mathcal{U}(P, \varphi \circ f) - \mathcal{L}(P, \varphi \circ f) \le \epsilon(b - a) + 2B\delta,$$

where $B$ is a bound for $\varphi$ on $[-M, M]$. Since we can also choose $\delta < \epsilon$,
we see that the lemma is proved.

From the lemma we get the following facts:

• If $f$ and $g$ are integrable on $[a, b]$, then the product $fg$ is integrable
on $[a, b]$.

This follows from the lemma with $\varphi(t) = t^2$, and the fact that
$fg = \frac{1}{4} \bigl( [f + g]^2 - [f - g]^2 \bigr)$.

• If $f$ is integrable on $[a, b]$, then the function $|f|$ is integrable, and
$\bigl| \int_a^b f(x) \, dx \bigr| \le \int_a^b |f(x)| \, dx$.

We can take $\varphi(t) = |t|$ to see that $|f|$ is integrable. Moreover, the
inequality follows from (iii) in Proposition 1.1.

We record two results that imply integrability. 

Proposition 1.3 A bounded monotonic function f on an interval [a, b] 
is integrable. 

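Proof. We may assume without loss of generality that $a = 0$, $b = 1$,
and $f$ is monotonically increasing. Then, for each $N$, we choose the
uniform partition $P_N$ given by $x_j = j/N$ for all $j = 0, \ldots, N$. If
$\alpha_j = f(x_j)$, then we have

$$\mathcal{U}(P_N, f) = \frac{1}{N} \sum_{j=1}^{N} \alpha_j \quad \text{and} \quad \mathcal{L}(P_N, f) = \frac{1}{N} \sum_{j=1}^{N} \alpha_{j-1}.$$

Therefore, if $|f(x)| \le B$ for all $x$ we have

$$\mathcal{U}(P_N, f) - \mathcal{L}(P_N, f) = \frac{\alpha_N - \alpha_0}{N} \le \frac{2B}{N},$$

and the proposition is proved.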
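
The telescoping identity in this proof can be checked directly. The sketch below assumes an increasing function on $[0, 1]$ (here $x^3$, chosen only for illustration); `upper_lower_monotone` is a hypothetical helper name.

```python
def upper_lower_monotone(f, N: int):
    """Upper and lower sums of an increasing f on [0, 1] for the uniform partition x_j = j/N."""
    alphas = [f(j / N) for j in range(N + 1)]
    upper = sum(alphas[1:]) / N   # the sup on [x_{j-1}, x_j] is f(x_j)
    lower = sum(alphas[:-1]) / N  # the inf on [x_{j-1}, x_j] is f(x_{j-1})
    return upper, lower


f = lambda x: x ** 3
for N in (10, 100, 1000):
    upper, lower = upper_lower_monotone(f, N)
    # The difference telescopes to (f(1) - f(0)) / N.
    print(N, lower, upper, upper - lower)
```

The gap between the upper and lower sums is exactly $(f(1) - f(0))/N$, so it can be made smaller than any $\epsilon$ by taking $N$ large.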
Proposition 1.4 Let $f$ be a bounded function on the compact interval
$[a, b]$. If $c \in (a, b)$, and if for all small $\delta > 0$ the function $f$ is integrable
on the intervals $[a, c - \delta]$ and $[c + \delta, b]$, then $f$ is integrable on $[a, b]$.

Proof. Suppose $|f| \le M$ and let $\epsilon > 0$. Choose $\delta > 0$ (small) so
that $4\delta M \le \epsilon/3$. Now let $P_1$ and $P_2$ be partitions of $[a, c - \delta]$ and
$[c + \delta, b]$ so that for each $i = 1, 2$ we have $\mathcal{U}(P_i, f) - \mathcal{L}(P_i, f) < \epsilon/3$. This is
possible since $f$ is integrable on each one of the intervals. Then by taking
as a partition $P = P_1 \cup \{c - \delta\} \cup \{c + \delta\} \cup P_2$ we immediately see that
$\mathcal{U}(P, f) - \mathcal{L}(P, f) < \epsilon$.

We end this section with a useful approximation lemma. Recall that 
a function on the circle is the same as a 27r-periodic function on R. 

Lemma 1.5 Suppose $f$ is integrable on the circle and $f$ is bounded by
$B$. Then there exists a sequence $\{f_k\}_{k=1}^{\infty}$ of continuous functions on the
circle so that

$$\sup_{x \in [-\pi, \pi]} |f_k(x)| \le B \quad \text{for all } k = 1, 2, \ldots,$$

and

$$\int_{-\pi}^{\pi} |f(x) - f_k(x)| \, dx \to 0 \quad \text{as } k \to \infty.$$

Proof. Assume $f$ is real-valued (in general apply the following
argument to the real and imaginary parts separately). Given $\epsilon > 0$, we may
choose a partition $-\pi = x_0 < x_1 < \cdots < x_N = \pi$ of the interval $[-\pi, \pi]$
so that the upper and lower sums of $f$ differ by at most $\epsilon$. Denote by $f^*$
the step function defined by

$$f^*(x) = \sup_{x_{j-1} \le y \le x_j} f(y) \quad \text{if } x \in [x_{j-1}, x_j) \text{ for } 1 \le j \le N.$$

By construction we have $|f^*| \le B$, and moreover

$$(1) \quad \int_{-\pi}^{\pi} |f^*(x) - f(x)| \, dx = \int_{-\pi}^{\pi} (f^*(x) - f(x)) \, dx < \epsilon.$$

Now we can modify $f^*$ to make it continuous and periodic yet still
approximate $f$ in the sense of the lemma. For small $\delta > 0$, let $\tilde{f}(x) = f^*(x)$
when the distance of $x$ from any of the division points $x_0, \ldots, x_N$ is
$\ge \delta$. In the $\delta$-neighborhood of $x_j$ for $j = 1, \ldots, N - 1$, define $\tilde{f}(x)$ to be
the linear function for which $\tilde{f}(x_j \pm \delta) = f^*(x_j \pm \delta)$. Near $x_0 = -\pi$, $\tilde{f}$
is linear with $\tilde{f}(-\pi) = 0$ and $\tilde{f}(-\pi + \delta) = f^*(-\pi + \delta)$. Similarly, near
$x_N = \pi$ the function $\tilde{f}$ is linear with $\tilde{f}(\pi) = 0$ and $\tilde{f}(\pi - \delta) = f^*(\pi - \delta)$.
In Figure 1 we illustrate the situation near $x_0 = -\pi$. In the second
picture the graph of $\tilde{f}$ is shifted slightly below to clarify the situation.

Figure 1. Portions of the functions $f^*$ and $\tilde{f}$

Then, since $\tilde{f}(-\pi) = \tilde{f}(\pi)$, we may extend $\tilde{f}$ to a continuous and
$2\pi$-periodic function on $\mathbb{R}$. The absolute value of this extension is also
bounded by $B$. Moreover, $\tilde{f}$ differs from $f^*$ only in the $N$ intervals of
length $2\delta$ surrounding the division points. Thus

$$\int_{-\pi}^{\pi} |f^*(x) - \tilde{f}(x)| \, dx \le 2BN(2\delta).$$

If we choose $\delta$ sufficiently small, we get

$$(2) \quad \int_{-\pi}^{\pi} |f^*(x) - \tilde{f}(x)| \, dx < \epsilon.$$

As a result, equations (1), (2), and the triangle inequality yield

$$\int_{-\pi}^{\pi} |f(x) - \tilde{f}(x)| \, dx < 2\epsilon.$$

Denoting by $f_k$ the $\tilde{f}$ so constructed, when $2\epsilon = 1/k$, we see that the
sequence $\{f_k\}$ has the properties required by the lemma.


1.2 Sets of measure zero and discontinuities of integrable functions

We observed that all continuous functions are integrable. By modifying
the argument slightly, one can show that all piecewise continuous
functions are also integrable. In fact, this is a consequence of Proposition 1.4
applied finitely many times. We now turn to a more careful study of the
discontinuities of integrable functions.

We start with a definition$^1$: a subset $E$ of $\mathbb{R}$ is said to have measure 0
if for every $\epsilon > 0$ there exists a countable family of open intervals $\{I_k\}_{k=1}^{\infty}$
such that

(i) $E \subset \bigcup_{k=1}^{\infty} I_k$,

(ii) $\sum_{k=1}^{\infty} |I_k| < \epsilon$, where $|I_k|$ denotes the length of the interval $I_k$.

The first condition says that the union of the intervals covers E, and the 
second that this union is small. The reader will have no difficulty proving 
that any finite set of points has measure 0. A more subtle argument is 
needed to prove that a countable set of points has measure 0. In fact, 
this result is contained in the following lemma. 

Lemma 1.6 The union of countably many sets of measure 0 has
measure 0.

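Proof. Say $E_1, E_2, \ldots$ are sets of measure 0, and let $E = \bigcup_{i=1}^{\infty} E_i$. Let
$\epsilon > 0$, and for each $i$ choose open intervals $I_{i,1}, I_{i,2}, \ldots$ so that

$$E_i \subset \bigcup_{k=1}^{\infty} I_{i,k} \quad \text{and} \quad \sum_{k=1}^{\infty} |I_{i,k}| < \epsilon / 2^i.$$

Now clearly we have $E \subset \bigcup_{i,k=1}^{\infty} I_{i,k}$, and

$$\sum_{i=1}^{\infty} \sum_{k=1}^{\infty} |I_{i,k}| \le \sum_{i=1}^{\infty} \frac{\epsilon}{2^i} \le \epsilon,$$

as was to be shown.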
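
The $\epsilon/2^i$ device is easy to see in the special case of a countable set of points, where each $E_i$ is a single point. The following sketch uses exact rational arithmetic; the finite list of rationals and the name `cover` are illustrative assumptions.

```python
from fractions import Fraction


def cover(points, eps):
    """Cover the i-th point (i = 1, 2, ...) by an open interval of length eps / 2**i."""
    intervals = []
    for i, p in enumerate(points, start=1):
        half = eps / 2 ** (i + 1)  # half-length, so the full length is eps / 2**i
        intervals.append((p - half, p + half))
    return intervals


eps = Fraction(1, 100)
# A finite initial segment of an enumeration (with repeats) of rationals in [0, 1].
points = [Fraction(a, b) for b in range(1, 8) for a in range(0, b + 1)]
intervals = cover(points, eps)
total_length = sum(hi - lo for lo, hi in intervals)
print(len(points), total_length, total_length < eps)
```

However many points are listed, the total length is $\epsilon(1 - 2^{-n}) < \epsilon$, which is why the countable union still has measure 0.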

An important observation is that if $E$ has measure 0 and is compact,
then it is possible to find a finite number of open intervals $I_k$,
$k = 1, \ldots, N$, that satisfy the two conditions (i) and (ii) above.

We can prove the characterization of Riemann integrable functions in 
terms of their discontinuities. 

Theorem 1.7 A bounded function f on [a, b] is integrable if and only if 
its set of discontinuities has measure 0. 


¹ A systematic study of the measure of sets arises in the theory of Lebesgue integration,
which is taken up in Book III.

We write $J = [a, b]$ and $I(c, r) = (c - r, c + r)$ for the open interval
centered at $c$ of radius $r > 0$. Define the oscillation of $f$ on $I(c, r)$ by

$$\operatorname{osc}(f, c, r) = \sup |f(x) - f(y)|$$

where the supremum is taken over all $x, y \in J \cap I(c, r)$. This quantity
exists since $f$ is bounded. Define the oscillation of $f$ at $c$ by

$$\operatorname{osc}(f, c) = \lim_{r \to 0} \operatorname{osc}(f, c, r).$$

This limit exists because $\operatorname{osc}(f, c, r)$ is $\ge 0$ and decreases as $r$
decreases to 0. The point is that $f$ is continuous at $c$ if and only if
$\operatorname{osc}(f, c) = 0$. This is clear from the definitions. For each $\epsilon > 0$ we
define a set $A_\epsilon$ by

$$A_\epsilon = \{c \in J : \operatorname{osc}(f, c) \ge \epsilon\}.$$

Having done that, we see that the set of points in $J$ where $f$ is
discontinuous is simply $\bigcup_{\epsilon > 0} A_\epsilon$. This is an important step in the proof of our
theorem.
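
The oscillation can be approximated numerically by sampling, which makes the criterion "continuous at $c$ if and only if the oscillation at $c$ vanishes" visible on examples. This is only a sketch under stated assumptions: the supremum is replaced by a maximum over finitely many sample points, and the two test functions `step` and `smooth` are hypothetical examples, not taken from the text.

```python
def osc(f, c, r, a=0.0, b=1.0, samples=2001):
    """Approximate osc(f, c, r): sup |f(x) - f(y)| over x, y in [a, b] within distance r of c.

    Since sup |f(x) - f(y)| equals sup f - inf f, it suffices to track the
    largest and smallest sampled values.
    """
    lo, hi = max(a, c - r), min(b, c + r)
    xs = [lo + (hi - lo) * k / (samples - 1) for k in range(samples)]
    values = [f(x) for x in xs]
    return max(values) - min(values)

def step(x):
    """Jump of size 1 at x = 1/2."""
    return 1.0 if x >= 0.5 else 0.0

def smooth(x):
    """Continuous everywhere."""
    return x * x

for r in (1e-1, 1e-2, 1e-3):
    print(r, osc(step, 0.5, r), osc(smooth, 0.5, r))
```

As $r$ shrinks, the sampled oscillation of `step` at $1/2$ stays equal to the size of the jump, while that of `smooth` tends to 0.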

Lemma 1.8 If $\epsilon > 0$, then the set $A_\epsilon$ is closed and therefore compact.

Proof. The argument is simple. Suppose $c_n \in A_\epsilon$ converges to $c$
and assume that $c \notin A_\epsilon$. Write $\operatorname{osc}(f, c) = \epsilon - \delta$ where $\delta > 0$. Select $r$
so that $\operatorname{osc}(f, c, r) < \epsilon - \delta/2$, and choose $n$ with $|c_n - c| < r/2$. Then
$\operatorname{osc}(f, c_n, r/2) < \epsilon$ which implies $\operatorname{osc}(f, c_n) < \epsilon$, a contradiction.


We are now ready to prove the first part of the theorem. Suppose
that the set $\mathcal{D}$ of discontinuities of $f$ has measure 0, and let $\epsilon > 0$.
Since $A_\epsilon \subset \mathcal{D}$, we can cover $A_\epsilon$ by a finite number of open intervals,
say $I_1, \ldots, I_N$, whose total length is $< \epsilon$. The complement of this union
$I$ of intervals is compact, and around each point $z$ in this complement we
can find an interval $F_z$ with $\sup_{x, y \in F_z} |f(x) - f(y)| \le \epsilon$, simply because
$z \notin A_\epsilon$. We may now choose a finite subcovering of $\bigcup_{z \in I^c} F_z$, which we
denote by $I_{N+1}, \ldots, I_{N'}$. Now, taking all the end points of the intervals
$I_1, \ldots, I_{N'}$ we obtain a partition $P$ of $[a, b]$ with

$$\mathcal{U}(P, f) - \mathcal{L}(P, f) \le 2M \sum_{j=1}^{N} |I_j| + \epsilon (b - a) \le C\epsilon,$$

where $M$ denotes a bound for $|f|$. Hence $f$ is integrable on $[a, b]$, as was to be shown.

Conversely, suppose that $f$ is integrable on $[a, b]$, and let $\mathcal{D}$ be its
set of discontinuities. Since $\mathcal{D}$ equals $\bigcup_{n=1}^{\infty} A_{1/n}$, it suffices to prove
that each $A_{1/n}$ has measure 0. Let $\epsilon > 0$ and choose a partition $P =
\{x_0, x_1, \ldots, x_N\}$ so that $\mathcal{U}(P, f) - \mathcal{L}(P, f) < \epsilon/n$. Then, if $A_{1/n}$
intersects $I_j = (x_{j-1}, x_j)$ we must have $\sup_{x \in I_j} f(x) - \inf_{x \in I_j} f(x) \ge 1/n$, and
this shows that

$$\frac{1}{n} \sum_{\{j \,:\, I_j \cap A_{1/n} \neq \emptyset\}} |I_j| \le \mathcal{U}(P, f) - \mathcal{L}(P, f) < \epsilon/n.$$

So by taking intervals intersecting $A_{1/n}$ and making them slightly larger,
we can cover $A_{1/n}$ with open intervals of total length $\le 2\epsilon$. Therefore
$A_{1/n}$ has measure 0, and we are done.

Note that incidentally, this gives another proof that $fg$ is integrable
whenever $f$ and $g$ are.
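
The direction "discontinuities of measure 0 imply integrability" can be watched on the simplest example, a function with a single jump. The sketch below assumes a nondecreasing integrand, so that the supremum and infimum over each subinterval are attained at its endpoints; the helper name `upper_lower_gap` and the test function `step` are illustrative only.

```python
def upper_lower_gap(f, a, b, n):
    """U(P, f) - L(P, f) on the uniform partition of [a, b] with n subintervals.

    Assumes f is nondecreasing, so the sup and inf on each subinterval
    are the values at its right and left endpoints.
    """
    h = (b - a) / n
    points = [a + k * h for k in range(n + 1)]
    return sum((f(points[k + 1]) - f(points[k])) * h for k in range(n))

def step(x):
    """Bounded function on [0, 1] whose only discontinuity is at x = 1/2."""
    return 1.0 if x >= 0.5 else 0.0

for n in (10, 100, 1000):
    print(n, upper_lower_gap(step, 0.0, 1.0, n))
```

Only the subinterval that straddles the jump contributes to the gap, so the gap is the length of one subinterval and shrinks as the partition is refined.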

2 Multiple integrals 

We assume that the reader is familiar with the standard theory of multiple
integrals of functions defined on bounded sets. Here, we give a quick
review of the main definitions and results of this theory. Then, we describe
the notion of “improper” multiple integration where the range of
integration is extended to all of $\mathbb{R}^d$. This is relevant to our study of the
Fourier transform. In the spirit of Chapters 5 and 6, we shall define the
integral of functions that are continuous and satisfy an adequate decay
condition at infinity.

Recall that the vector space $\mathbb{R}^d$ consists of all $d$-tuples of real numbers
$x = (x_1, x_2, \ldots, x_d)$ with $x_j \in \mathbb{R}$, where addition and multiplication by
scalars are defined componentwise.

2.1 The Riemann integral in $\mathbb{R}^d$
Definitions

The notion of Riemann integration on a rectangle $R \subset \mathbb{R}^d$ is an immediate
generalization of the notion of Riemann integration on an interval
$[a, b] \subset \mathbb{R}$. We restrict our attention to continuous functions; these are
always integrable.

By a closed rectangle in $\mathbb{R}^d$ we mean a set of the form

$$R = \{a_j \le x_j \le b_j : 1 \le j \le d\}$$

where $a_j, b_j \in \mathbb{R}$ for $1 \le j \le d$. In other words, $R$ is the product of the
one-dimensional intervals $[a_j, b_j]$:

$$R = [a_1, b_1] \times \cdots \times [a_d, b_d].$$

If $P_j$ is a partition of the closed interval $[a_j, b_j]$, then we call
$P = (P_1, \ldots, P_d)$ a partition of $R$; and if $S_j$ is a subinterval of the
partition $P_j$, then $S = S_1 \times \cdots \times S_d$ is a subrectangle of the partition
$P$. The volume $|S|$ of a subrectangle $S$ is naturally given by the product
of the length of its sides $|S| = |S_1| \times \cdots \times |S_d|$, where $|S_j|$ denotes the
length of the interval $S_j$.

We are now ready to define the notion of integral over $R$. Given
a bounded real-valued function $f$ defined on $R$ and a partition $P$, we
define the upper and lower sums of $f$ with respect to $P$ by

$$\mathcal{U}(P, f) = \sum \big[\sup_{x \in S} f(x)\big]\, |S| \quad\text{and}\quad \mathcal{L}(P, f) = \sum \big[\inf_{x \in S} f(x)\big]\, |S|,$$

where the sums are taken over all subrectangles $S$ of the partition $P$. These
definitions are direct generalizations of the analogous notions in one
dimension.

A partition $P' = (P'_1, \ldots, P'_d)$ is a refinement of $P = (P_1, \ldots, P_d)$ if
each $P'_j$ is a refinement of $P_j$. Arguing with these refinements as we did
in the one-dimensional case, we see that if we define

$$U = \inf_P \mathcal{U}(P, f) \quad\text{and}\quad L = \sup_P \mathcal{L}(P, f),$$

then both $U$ and $L$ exist, are finite, and $U \ge L$. We say that $f$ is Riemann
integrable on $R$ if for every $\epsilon > 0$ there exists a partition $P$ so
that

$$\mathcal{U}(P, f) - \mathcal{L}(P, f) < \epsilon.$$

This implies that $U = L$, and this common value, which we shall denote
by either

$$\int_R f(x_1, \ldots, x_d)\, dx_1 \cdots dx_d, \quad \int_R f(x)\, dx, \quad\text{or}\quad \int_R f,$$

is by definition the integral of $f$ over $R$. If $f$ is complex-valued, say
$f(x) = u(x) + i v(x)$, where $u$ and $v$ are real-valued, we naturally define

$$\int_R f(x)\, dx = \int_R u(x)\, dx + i \int_R v(x)\, dx.$$

In the results that follow, we are primarily interested in continuous
functions. Clearly, if $f$ is continuous on a closed rectangle $R$ then $f$ is
integrable since it is uniformly continuous on $R$. Also, we note that if
$f$ is continuous on, say, a closed ball $B$, then we may define its integral
over $B$ in the following way: if $g$ is the extension of $f$ defined by $g(x) = 0$
if $x \notin B$, then $g$ is integrable on any rectangle $R$ that contains $B$, and
we may set

$$\int_B f(x)\, dx = \int_R g(x)\, dx.$$



2.2 Repeated integrals 

The fundamental theorem of calculus allows us to compute many
one-dimensional integrals, since it is possible in many instances to find an
antiderivative for the integrand. In $\mathbb{R}^d$ this permits the calculation of
multiple integrals, since a $d$-dimensional integral actually reduces to $d$
one-dimensional integrals. A precise statement describing this fact is
given by the following.

Theorem 2.1 Let $f$ be a continuous function defined on a closed rectangle
$R \subset \mathbb{R}^d$. Suppose $R = R_1 \times R_2$ where $R_1 \subset \mathbb{R}^{d_1}$ and $R_2 \subset \mathbb{R}^{d_2}$
with $d = d_1 + d_2$. If we write $x = (x_1, x_2)$ with $x_i \in \mathbb{R}^{d_i}$, then $F(x_1) =
\int_{R_2} f(x_1, x_2)\, dx_2$ is continuous on $R_1$, and we have

$$\int_R f(x)\, dx = \int_{R_1} \left( \int_{R_2} f(x_1, x_2)\, dx_2 \right) dx_1.$$



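Proof. The continuity of $F$ follows from the uniform continuity of $f$
on $R$ and the fact that

$$|F(x_1) - F(x_1')| \le \int_{R_2} |f(x_1, x_2) - f(x_1', x_2)|\, dx_2.$$

To prove the identity, let $P_1$ and $P_2$ be partitions of $R_1$ and $R_2$, respectively.
If $S$ and $T$ are subrectangles in $P_1$ and $P_2$, respectively, then the
key observation is that

$$\sup_{S \times T} f(x_1, x_2) \ge \sup_{x_1 \in S} \Big( \sup_{x_2 \in T} f(x_1, x_2) \Big)$$

and

$$\inf_{S \times T} f(x_1, x_2) \le \inf_{x_1 \in S} \Big( \inf_{x_2 \in T} f(x_1, x_2) \Big).$$

Then,

$$\begin{aligned}
\mathcal{U}(P, f) &= \sum_{S, T} \big[ \sup_{S \times T} f(x_1, x_2) \big]\, |S \times T| \\
&\ge \sum_{S} \sum_{T} \sup_{x_1 \in S} \big[ \sup_{x_2 \in T} f(x_1, x_2) \big]\, |T| \times |S| \\
&\ge \sum_{S} \sup_{x_1 \in S} \Big( \int_{R_2} f(x_1, x_2)\, dx_2 \Big)\, |S| \\
&\ge \mathcal{U}\Big( P_1, \int_{R_2} f(x_1, x_2)\, dx_2 \Big).
\end{aligned}$$

Arguing similarly for the lower sums, we find that

$$\mathcal{L}(P, f) \le \mathcal{L}\Big( P_1, \int_{R_2} f(x_1, x_2)\, dx_2 \Big) \le \mathcal{U}\Big( P_1, \int_{R_2} f(x_1, x_2)\, dx_2 \Big) \le \mathcal{U}(P, f),$$

and the theorem follows from these inequalities.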


Repeating this argument, we find as a corollary that if $f$ is continuous
on the rectangle $R \subset \mathbb{R}^d$ given by $R = [a_1, b_1] \times \cdots \times [a_d, b_d]$, then

$$\int_R f(x)\, dx = \int_{a_1}^{b_1} \left( \int_{a_2}^{b_2} \cdots \left( \int_{a_d}^{b_d} f(x_1, \ldots, x_d)\, dx_d \right) \cdots\, dx_2 \right) dx_1,$$

where the right-hand side denotes $d$-iterates of one-dimensional integrals.
It is also clear from the theorem that we can interchange the order of
integration in the repeated integral as desired.
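
The reduction of a double integral to repeated one-dimensional integrals can be checked numerically. The following is a minimal sketch, assuming midpoint Riemann sums on a uniform partition and a hypothetical integrand $f(x_1, x_2) = x_1 x_2^2$ on $[0,1] \times [0,2]$, whose exact integral is $\frac12 \cdot \frac83 = \frac43$.

```python
def midpoint_1d(g, a, b, n):
    """Midpoint Riemann sum of g over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def iterated(f, rect, n):
    """Integrate in x2 first to obtain F(x1), then integrate F in x1."""
    (a1, b1), (a2, b2) = rect
    F = lambda x1: midpoint_1d(lambda x2: f(x1, x2), a2, b2, n)
    return midpoint_1d(F, a1, b1, n)

def double_sum(f, rect, n):
    """Midpoint Riemann sum over the n * n subrectangles of the product partition."""
    (a1, b1), (a2, b2) = rect
    h1, h2 = (b1 - a1) / n, (b2 - a2) / n
    total = sum(f(a1 + (i + 0.5) * h1, a2 + (j + 0.5) * h2)
                for i in range(n) for j in range(n))
    return total * h1 * h2

rect = ((0.0, 1.0), (0.0, 2.0))
integrand = lambda x1, x2: x1 * x2 * x2   # hypothetical example
print(iterated(integrand, rect, 100), double_sum(integrand, rect, 100))
```

Both numbers approximate $4/3$, and they agree with each other up to rounding because the two procedures add up the same terms in a different order.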


2.3 The change of variables formula 

A diffeomorphism of class $C^1$, $g : A \to B$, is a mapping that is continuously
differentiable, invertible, and whose inverse $g^{-1} : B \to A$ is also
continuously differentiable. We denote by $Dg$ the Jacobian or derivative
of $g$. Then, the change of variables formula says the following.

Theorem 2.2 Suppose $A$ and $B$ are compact subsets of $\mathbb{R}^d$ and
$g : A \to B$ is a diffeomorphism of class $C^1$. If $f$ is continuous on $B$,
then

$$\int_{g(A)} f(x)\, dx = \int_A f(g(y))\, |\det(Dg)(y)|\, dy.$$

The proof of this theorem consists first of an analysis of the special
situation when $g$ is a linear transformation $L$. In this case, if $R$ is a
rectangle, then

$$|g(R)| = |\det(L)|\, |R|,$$

which explains the term $|\det(Dg)|$. Indeed, this term corresponds to the
new infinitesimal element of volume after the change of variables.
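
The linear case can be verified directly in the plane. The sketch below assumes $d = 2$ and uses the shoelace formula for the area of a polygon; the matrix `L` and the rectangle are arbitrary illustrative choices.

```python
def det2(L):
    """Determinant of a 2 x 2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = L
    return a * d - b * c

def apply(L, p):
    """Image of the point p under the linear map with matrix L."""
    (a, b), (c, d) = L
    x, y = p
    return (a * x + b * y, c * x + d * y)

def polygon_area(vertices):
    """Shoelace formula; returns the unsigned area of a simple polygon."""
    n = len(vertices)
    s = 0.0
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2

rectangle = [(0, 0), (3, 0), (3, 2), (0, 2)]   # |R| = 6
L = ((2, 1), (-1, 4))                          # hypothetical linear map
image = [apply(L, p) for p in rectangle]       # a parallelogram

print(polygon_area(image), abs(det2(L)) * polygon_area(rectangle))
```

The two printed areas coincide, which is the identity $|L(R)| = |\det(L)|\,|R|$ for this particular map.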


2.4 Spherical coordinates 

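An important application of the change of variables formula is to the case
of polar coordinates in $\mathbb{R}^2$, spherical coordinates in $\mathbb{R}^3$, and their
generalization in $\mathbb{R}^d$. These are particularly important when the function, or set
we are integrating over, exhibits some rotational (or spherical) symmetries.
The cases $d = 2$ and $d = 3$ were given in Chapter 6. More generally,
the spherical coordinates system in $\mathbb{R}^d$ is given by $x = g(r, \theta_1, \ldots, \theta_{d-1})$
where

$$\begin{cases}
x_1 = r \sin\theta_1 \sin\theta_2 \cdots \sin\theta_{d-2} \cos\theta_{d-1}, \\
x_2 = r \sin\theta_1 \sin\theta_2 \cdots \sin\theta_{d-2} \sin\theta_{d-1}, \\
\quad\vdots \\
x_{d-1} = r \sin\theta_1 \cos\theta_2, \\
x_d = r \cos\theta_1,
\end{cases}$$

with $0 \le \theta_i \le \pi$ for $1 \le i \le d - 2$ and $0 \le \theta_{d-1} \le 2\pi$. The determinant
of the Jacobian of this transformation is given by

$$r^{d-1} \sin^{d-2}\theta_1\, \sin^{d-3}\theta_2 \cdots \sin\theta_{d-2}.$$

Any point $x \in \mathbb{R}^d - \{0\}$ can be written uniquely as $r\gamma$ with $\gamma \in S^{d-1}$
the unit sphere in $\mathbb{R}^d$. If we define

$$\int_{S^{d-1}} f(\gamma)\, d\sigma(\gamma) = \int_0^{\pi} \int_0^{\pi} \cdots \int_0^{2\pi} f(g(r, \theta))\, \sin^{d-2}\theta_1\, \sin^{d-3}\theta_2 \cdots \sin\theta_{d-2}\; d\theta_{d-1} \cdots d\theta_1,$$

then we see that if $B(0, N)$ denotes the ball of radius $N$ centered at the
origin, then

$$(3)\qquad \int_{B(0,N)} f(x)\, dx = \int_{S^{d-1}} \int_0^N f(r\gamma)\, r^{d-1}\, dr\, d\sigma(\gamma).$$

In fact, we define the area of the unit sphere $S^{d-1} \subset \mathbb{R}^d$ as

$$\omega_d = \int_{S^{d-1}} d\sigma(\gamma).$$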
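
A quick sanity check on a spherical parametrization is that the image point has norm $r$. The sketch below assumes the standard convention in which $x_d = r\cos\theta_1$ and each preceding coordinate picks up one more sine factor, with the last angle $\theta_{d-1}$ entering through both a cosine (in $x_1$) and a sine (in $x_2$); the function name is illustrative.

```python
import math

def spherical_to_cartesian(r, thetas):
    """Map (r, theta_1, ..., theta_{d-1}) to (x_1, ..., x_d).

    Convention assumed here: x_d = r cos(theta_1),
    x_{d-1} = r sin(theta_1) cos(theta_2), and so on, ending with
    x_2 = r sin(theta_1) ... sin(theta_{d-1}) and
    x_1 = r sin(theta_1) ... sin(theta_{d-2}) cos(theta_{d-1}).
    """
    d = len(thetas) + 1
    x = [0.0] * d
    sin_prod = 1.0
    for i, theta in enumerate(thetas[:-1]):
        x[d - 1 - i] = r * sin_prod * math.cos(theta)
        sin_prod *= math.sin(theta)
    x[1] = r * sin_prod * math.sin(thetas[-1])
    x[0] = r * sin_prod * math.cos(thetas[-1])
    return x

point = spherical_to_cartesian(2.5, [0.3, 1.1, 2.0, 4.0])   # a point of R^5
print(point, math.sqrt(sum(c * c for c in point)))
```

The identity $\cos^2 + \sin^2 = 1$, applied once per angle starting from the last, is what collapses the sum of squares to $r^2$.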



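An important application of spherical coordinates is to the calculation
of the integral $\int_{A(R_1, R_2)} |x|^{\lambda}\, dx$, where $A(R_1, R_2)$ denotes the annulus
$A(R_1, R_2) = \{R_1 \le |x| \le R_2\}$ and $\lambda \in \mathbb{R}$. Applying polar coordinates,
we find

$$\int_{A(R_1, R_2)} |x|^{\lambda}\, dx = \int_{S^{d-1}} \int_{R_1}^{R_2} r^{\lambda + d - 1}\, dr\, d\sigma(\gamma).$$

Therefore

$$\int_{A(R_1, R_2)} |x|^{\lambda}\, dx =
\begin{cases}
\dfrac{\omega_d}{\lambda + d} \left[ R_2^{\lambda + d} - R_1^{\lambda + d} \right] & \text{if } \lambda \neq -d, \\[2ex]
\omega_d \left[ \log(R_2) - \log(R_1) \right] & \text{if } \lambda = -d.
\end{cases}$$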
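
The two cases of the annulus formula can be compared against a direct numerical evaluation of the radial integral. This is a sketch under stated assumptions: the dimension is taken to be $d = 2$, where the "area" of the unit circle is $\omega_2 = 2\pi$, and the radial integral is approximated by a midpoint rule.

```python
import math

def annulus_closed_form(lam, r1, r2, d, omega_d):
    """Closed form for the integral of |x|**lam over the annulus r1 <= |x| <= r2."""
    if lam != -d:
        return omega_d / (lam + d) * (r2 ** (lam + d) - r1 ** (lam + d))
    return omega_d * (math.log(r2) - math.log(r1))

def annulus_numeric(lam, r1, r2, d, omega_d, n=20000):
    """omega_d times a midpoint approximation of the integral of r**(lam + d - 1) on [r1, r2]."""
    h = (r2 - r1) / n
    return omega_d * h * sum((r1 + (k + 0.5) * h) ** (lam + d - 1) for k in range(n))

omega_2 = 2 * math.pi
for lam in (1.5, -2.0, -3.0):
    print(lam,
          annulus_closed_form(lam, 1.0, 3.0, 2, omega_2),
          annulus_numeric(lam, 1.0, 3.0, 2, omega_2))
```

The case $\lambda = -3 = -d - 1$ is the one needed later when integrating functions of moderate decrease.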


3 Improper integrals. Integration over $\mathbb{R}^d$

Most of the theorems we just discussed extend to functions integrated
over all of $\mathbb{R}^d$ once we impose some decay at infinity on the functions we
integrate.


3.1 Integration of functions of moderate decrease 

For each fixed $N > 0$ consider the closed cube in $\mathbb{R}^d$ centered at the origin
with sides parallel to the axes, and of side length $N$: $Q_N = \{|x_j| \le N/2 :
1 \le j \le d\}$. Let $f$ be a continuous function on $\mathbb{R}^d$. If the limit

$$\lim_{N \to \infty} \int_{Q_N} f(x)\, dx$$

exists, we denote it by

$$\int_{\mathbb{R}^d} f(x)\, dx.$$

We deal with a special class of functions whose integrals over $\mathbb{R}^d$ exist.
A continuous function $f$ on $\mathbb{R}^d$ is said to be of moderate decrease if
there exists $A > 0$ such that

$$|f(x)| \le \frac{A}{1 + |x|^{d+1}}.$$

Note that if $d = 1$ we recover the definition given in Chapter 5. An
important example of a function of moderate decrease in $\mathbb{R}$ is the Poisson
kernel given by $\mathcal{P}_y(x) = \dfrac{1}{\pi} \dfrac{y}{x^2 + y^2}$.

We claim that if $f$ is of moderate decrease, then the above limit exists.
Let $I_N = \int_{Q_N} f(x)\, dx$. Each $I_N$ exists because $f$ is continuous hence
integrable. For $M > N$, we have

$$|I_M - I_N| \le \int_{Q_M - Q_N} |f(x)|\, dx.$$

Now observe that the set $Q_M - Q_N$ is contained in the annulus
$A(aN, bM) = \{aN \le |x| \le bM\}$, where $a$ and $b$ are constants that depend
only on the dimension $d$. This is because the cube $Q_N$ contains the ball
of radius $N/2$ and is contained in the ball of radius $N\sqrt{d}/2$, so that we
can take $a = 1/2$ and $b = \sqrt{d}/2$. Therefore, using the fact that $f$ is of
moderate decrease yields

$$|I_M - I_N| \le A \int_{aN \le |x| \le bM} |x|^{-d-1}\, dx.$$

Now putting $\lambda = -d - 1$ in the calculation of the integral of the previous
section, we find that

$$|I_M - I_N| \le C \left( \frac{1}{aN} - \frac{1}{bM} \right).$$

So if $f$ is of moderate decrease, we conclude that $\{I_N\}_{N=1}^{\infty}$ is a Cauchy
sequence, and therefore $\int_{\mathbb{R}^d} f(x)\, dx$ exists.

Instead of the rectangles $Q_N$, we could have chosen the balls $B_N$ centered
at the origin and of radius $N$. Then, if $f$ is of moderate decrease,
the reader should have no difficulties proving that $\lim_{N \to \infty} \int_{B_N} f(x)\, dx$
exists, and that this limit equals $\lim_{N \to \infty} \int_{Q_N} f(x)\, dx$.

Some elementary properties of the integrals of functions of moderate 
decrease are summarized in Chapter 6. 
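
The Cauchy property of the integrals over growing cubes can be observed in dimension $d = 1$ with the Poisson kernel. The sketch below assumes the normalization $\mathcal{P}_y(x) = \frac{1}{\pi}\frac{y}{x^2+y^2}$ with $y = 1$, approximates each integral by a midpoint rule, and compares it with the antiderivative value $\frac{2}{\pi}\arctan(N/2)$.

```python
import math

def poisson_kernel(x, y=1.0):
    """Poisson kernel on the real line (normalization assumed as in the lead-in)."""
    return y / (math.pi * (x * x + y * y))

def integral_over_cube(f, N, n=40000):
    """Midpoint approximation of the integral of f over Q_N = [-N/2, N/2] when d = 1."""
    h = N / n
    return h * sum(f(-N / 2 + (k + 0.5) * h) for k in range(n))

I = {N: integral_over_cube(poisson_kernel, N) for N in (50, 100, 200, 400)}
for N, value in I.items():
    print(N, value, 2 / math.pi * math.atan(N / 2))
```

Successive differences $|I_{2N} - I_N|$ shrink roughly like $1/N$, in line with the estimate $C(1/(aN) - 1/(bM))$, and the values approach 1.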


3.2 Repeated integrals 

In Chapters 5 and 6 we claimed that the multiplication formula held for 
functions of moderate decrease. This required an appropriate interchange 
of integration. Similarly for operators defined in terms of convolutions 
(with the Poisson kernel for example). 

We now justify the necessary formula for iterated integrals. We only 
consider the case $d = 2$, although the reader will have no difficulty
extending this result to arbitrary dimensions.

Theorem 3.1 Suppose $f$ is continuous on $\mathbb{R}^2$ and of moderate decrease.
Then

$$F(x_1) = \int_{\mathbb{R}} f(x_1, x_2)\, dx_2$$

is of moderate decrease on $\mathbb{R}$, and

$$\int_{\mathbb{R}^2} f(x)\, dx = \int_{\mathbb{R}} \left( \int_{\mathbb{R}} f(x_1, x_2)\, dx_2 \right) dx_1.$$

Proof. To see why $F$ is of moderate decrease, note first that

$$|F(x_1)| \le \int_{\mathbb{R}} \frac{A\, dx_2}{1 + (x_1^2 + x_2^2)^{3/2}} \le \int_{|x_2| \le |x_1|} + \int_{|x_2| \ge |x_1|}.$$

In the first integral, we observe that the integrand is $\le A/(1 + |x_1|^3)$, so

$$\int_{|x_2| \le |x_1|} \frac{A\, dx_2}{1 + (x_1^2 + x_2^2)^{3/2}} \le \frac{A}{1 + |x_1|^3} \int_{|x_2| \le |x_1|} dx_2 \le \frac{A'}{1 + |x_1|^2}.$$

For the second integral, we have

$$\int_{|x_2| \ge |x_1|} \frac{A\, dx_2}{1 + (x_1^2 + x_2^2)^{3/2}} \le A'' \int_{|x_2| \ge |x_1|} \frac{dx_2}{1 + |x_2|^3} \le \frac{A'''}{|x_1|^2},$$

thus $F$ is of moderate decrease. In fact, this argument together with
Theorem 2.1 shows that $F$ is the uniform limit of continuous functions,
thus is also continuous.

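To establish the identity we simply use an approximation and Theorem 2.1
over finite rectangles. Write $S^c$ to denote the complement of a
set $S$. Given $\epsilon > 0$ choose $N$ so large that

$$\left| \int_{\mathbb{R}^2} f(x_1, x_2)\, dx_1 dx_2 - \int_{I_N \times I_N} f(x_1, x_2)\, dx_1 dx_2 \right| < \epsilon,$$

where $I_N = [-N, N]$. Now we know that

$$\int_{I_N \times I_N} f(x_1, x_2)\, dx_1 dx_2 = \int_{I_N} \left( \int_{I_N} f(x_1, x_2)\, dx_2 \right) dx_1.$$

But this last iterated integral can be written as

$$\int_{\mathbb{R}} \left( \int_{\mathbb{R}} f(x_1, x_2)\, dx_2 \right) dx_1
- \int_{I_N^c} \left( \int_{\mathbb{R}} f(x_1, x_2)\, dx_2 \right) dx_1
- \int_{I_N} \left( \int_{I_N^c} f(x_1, x_2)\, dx_2 \right) dx_1.$$

We can now estimate the two remainder terms. Since $F$ is of moderate decrease,

$$\left| \int_{I_N^c} \left( \int_{\mathbb{R}} f(x_1, x_2)\, dx_2 \right) dx_1 \right| \le C \int_{|x_1| \ge N} \frac{dx_1}{1 + x_1^2} = O\left( \frac{1}{N} \right),$$

and, splitting the range of $x_1$ at $|x_1| = 1$,

$$\left| \int_{I_N} \left( \int_{I_N^c} f(x_1, x_2)\, dx_2 \right) dx_1 \right| \le O\left( \frac{1}{N^2} \right) + C \int_{1 \le |x_1| \le N} \left( \int_{|x_2| \ge N} \frac{dx_2}{(|x_1| + |x_2|)^3} \right) dx_1 = O\left( \frac{1}{N} \right).$$

Therefore, we can find $N$ so large that

$$\left| \int_{I_N \times I_N} f(x_1, x_2)\, dx_1 dx_2 - \int_{\mathbb{R}} \left( \int_{\mathbb{R}} f(x_1, x_2)\, dx_2 \right) dx_1 \right| < \epsilon,$$

and we are done.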
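
The first assertion of Theorem 3.1, that integrating out one variable again produces a function of moderate decrease, can be illustrated on an explicit example. The sketch below assumes the hypothetical function $f(x_1, x_2) = (1 + x_1^2 + x_2^2)^{-2}$, which is of moderate decrease on $\mathbb{R}^2$, and uses the standard integral $\int_{\mathbb{R}} (a^2 + t^2)^{-2}\, dt = \pi/(2a^3)$ to obtain $F(x_1) = \pi / \big(2 (1 + x_1^2)^{3/2}\big)$ for comparison.

```python
import math

def f(x1, x2):
    """Hypothetical example of a function of moderate decrease on R^2."""
    return 1.0 / (1.0 + x1 * x1 + x2 * x2) ** 2

def F(x1, L=200.0, n=40000):
    """Midpoint approximation of the integral of f(x1, .) over [-L, L]."""
    h = 2 * L / n
    return h * sum(f(x1, -L + (k + 0.5) * h) for k in range(n))

def F_exact(x1):
    """Closed form of the integral of f(x1, .) over the whole line."""
    return math.pi / (2.0 * (1.0 + x1 * x1) ** 1.5)

for x1 in (0.0, 1.0, 5.0, 20.0):
    # (1 + x1**2) * F(x1) stays bounded, which is moderate decrease on R.
    print(x1, F(x1), F_exact(x1), (1 + x1 * x1) * F(x1))
```

Integrating `F_exact` over the line gives $\pi$, which matches the value of the double integral computed in polar coordinates, in agreement with the identity in the theorem.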


3.3 Spherical coordinates 


In $\mathbb{R}^d$, spherical coordinates are given by $x = r\gamma$, where $r \ge 0$ and $\gamma$
belongs to the unit sphere $S^{d-1}$. If $f$ is of moderate decrease, then
for each fixed $\gamma \in S^{d-1}$, the function of $r$ given by $f(r\gamma)\, r^{d-1}$ is also of
moderate decrease on $\mathbb{R}$. Indeed, we have

$$\left| f(r\gamma)\, r^{d-1} \right| \le A\, \frac{r^{d-1}}{1 + |r\gamma|^{d+1}} \le \frac{B}{1 + r^2}.$$

As a result, by letting $N \to \infty$ in (3) we obtain the formula

$$\int_{\mathbb{R}^d} f(x)\, dx = \int_{S^{d-1}} \int_0^{\infty} f(r\gamma)\, r^{d-1}\, dr\, d\sigma(\gamma).$$

As a consequence, if we combine the fact that

$$\int_{\mathbb{R}^d} f(R(x))\, dx = \int_{\mathbb{R}^d} f(x)\, dx,$$

whenever $R$ is a rotation, with the identity (3), then we obtain that

$$(4)\qquad \int_{S^{d-1}} f(R(\gamma))\, d\sigma(\gamma) = \int_{S^{d-1}} f(\gamma)\, d\sigma(\gamma).$$


Seeley [29] gives an elegant and brief introduction to Fourier series and the
Fourier transform. The authoritative text on Fourier series is Zygmund [36].
For further applications of Fourier analysis to a variety of other topics, see Dym
and McKean [8] and Körner [21]. The reader should also consult the book by
Kahane and Lemarié-Rieusset [20], which contains many historical facts and
other results related to Fourier series.

Chapter 1 

The citation is taken from a letter of Fourier to an unknown correspondent 
(probably Lagrange), see Herivel [15]. 

More facts about the early history of Fourier series can be found in Sections 
I-III of Riemann’s memoir [27]. 

Chapter 2 

The quote is a translation of an excerpt in Riemann’s paper [27]. 

For a proof of Littlewood’s theorem (Problem 3), as well as other related 
“Tauberian theorems,” see Chapter 7 in Titchmarsh [32]. 

Chapter 3 

The citation is a translation of a passage in Dirichlet’s memoir [6]. 

Chapter 4 

The quote is translated from Hurwitz [17]. 

The problem of a ray of light reflecting inside a square is discussed in
Chapter 23 of Hardy and Wright [13].

The relationship between the diameter of a curve and Fourier coefficients 
(Problem 1) is explored in Pfluger [26]. 

Many topics concerning equidistribution of sequences, including the results in 
Problems 2 and 3, are taken up in Kuipers and Niederreiter [22]. 

Chapter 5 

The citation is a free translation of a passage in Schwartz [28]. 

For topics in finance, see Duffie [7], and in particular Chapter 5 for the
Black-Scholes theory (Problems 1 and 2).

The results in Problems 4, 5, and 6 are worked out in John [19] and
Widder [34].

For Problem 7, see Chapter 2 in Wiener [35]. 

The original proof of the nowhere differentiability of $f_1$ (Problem 8) is in
Hardy [12].

Chapter 6 

The quote is an excerpt from Cormack’s Nobel Prize lecture [5]. 

More about the wave equation, as well as the results in Problems 3, 4, and 5
can be found in Chapter 5 of Folland [9].

A discussion of the relationship between rotational symmetry, the Fourier
transform, and Bessel functions is in Chapter 4 of Stein and Weiss [31].

For more on the Radon transform, see Chapter 1 in John [18], Helgason [14], 
and Ludwig [25]. 

Chapter 7 

The citation is taken from Bingham and Tukey [2]. 

Proofs of the structure theorem for finite abelian groups (Problem 2) can be
found in Chapter 2 of Herstein [16], Chapter 2 in Lang [23], or Chapter 104 in
Körner [21].

For Problem 4, see Andrews [1], which contains a short proof. 

Chapter 8 

The citation is from Bochner [3]. 

For more on the divisor function, see Chapter 18 in Hardy and Wright [13]. 
Another “elementary” proof that $L(1, \chi) \neq 0$ can be found in Chapter 3 of
Gelfond and Linnik [11].

An alternate proof that $L(1, \chi) \neq 0$ based on algebraic number theory is in
Weyl [33]. Also, two other analytic variants of the proof that $L(1, \chi) \neq 0$ can be
found in Chapter 109 in Körner [21] and Chapter 6 in Serre [30]. See also the
latter reference for Problems 3 and 4.

Appendix 

Further details about the results on integration reviewed in the appendix can
be found in Folland [10] (Chapter 4), Buck [4] (Chapter 4), or Lang [24]
(Chapter 20).

Symbol Glossary


The page numbers on the right indicate the first time the symbol or
notation is defined or used. As usual, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$ and $\mathbb{C}$ denote the integers,
the rationals, the reals, and the complex numbers respectively.

$\triangle$                                      Laplacian                                            20, 185
$|z|$, $\bar{z}$                                 Absolute value and complex conjugate                 22
$e^z$                                            Complex exponential                                  24
$\sinh x$, $\cosh x$                             Hyperbolic sine and hyperbolic cosine                28
$\hat{f}(n)$, $a_n$                              Fourier coefficient                                  34
$f(\theta) \sim \sum a_n e^{in\theta}$           Fourier series                                       34
$S_N(f)$                                         Partial sum of a Fourier series                      35
$D_N$, $\tilde{D}_N$, $D_N^*$                    Dirichlet kernel, conjugate, and modified            37, 95, 165
$P_r$, $\mathcal{P}_y$, $\mathcal{P}_y^{(d)}$    Poisson kernels                                      37, 149, 210
$O$, $o$                                         Big O and little o notation                          42, 62
$C^k$                                            Functions that are $k$ times differentiable          44
$f * g$                                          Convolution                                          44, 139, 184, 239
$\sigma_N$, $\sigma_N(f)$                        Cesàro mean                                          52, 53
$F_N$, $\mathcal{F}_R$                           Fejér kernel                                         53, 163
$A(r)$, $A_r(f)$                                 Abel mean                                            54, 55
$\chi_{[a,b]}$                                   Characteristic function                              61
$f(\theta^+)$, $f(\theta^-)$                     One-sided limits at jump discontinuities             63
$\mathbb{R}^d$, $\mathbb{C}^d$                   Euclidean spaces                                     71
$X \perp Y$                                      Orthogonal vectors                                   72
$\ell^2(\mathbb{Z})$                             Square summable sequences                            73
$\mathcal{R}$                                    Riemann integrable functions                         75
$\zeta(s)$                                       Zeta function                                        98
$[x]$, $\langle x \rangle$                       Integer and fractional parts                         106
$\triangle_N$, $\sigma_{N,K}$, $\tilde{\triangle}_N$   Delayed means                                  114, 127, 174
$H_t$, $\mathcal{H}_t$, $\mathcal{H}_t^{(d)}$    Heat kernels                                         120, 146, 209
$\mathcal{M}(\mathbb{R})$                        Space of functions of moderate decrease on $\mathbb{R}$   131
$\hat{f}(\xi)$                                   Fourier transform                                    134, 181
$\mathcal{S}$, $\mathcal{S}(\mathbb{R})$, $\mathcal{S}(\mathbb{R}^d)$   Schwartz space                134, 180
$\mathbb{R}^2_+$, $\overline{\mathbb{R}^2_+}$    Upper half-plane and its closure                     149
$\vartheta(s)$, $\Theta(z|\tau)$                 Theta functions                                      155, 156
$\Gamma(s)$                                      Gamma function                                       165
$\|x\|$, $|x|$; $(x, y)$, $x \cdot y$            Norm and inner product in $\mathbb{R}^d$             71, 176
$x^\alpha$, $|\alpha|$, $(\partial/\partial x)^\alpha$   Monomial, its order, and differential operator   176
$S^1$, $S^2$, $S^{d-1}$                          Unit circle in $\mathbb{R}^2$, and unit spheres in $\mathbb{R}^3$, $\mathbb{R}^d$   179, 180
$M_t$, $\tilde{M}_t$                             Spherical mean                                       194, 216
$J_n$                                            Bessel function                                      197, 213
$\mathcal{P}_{t,\gamma}$                         Plane                                                202
$\mathcal{R}$, $\mathcal{R}^*$                   Radon and dual Radon transforms                      203, 205
$A_d$, $V_d$                                     Area and volume of the unit sphere in $\mathbb{R}^d$ 208
$\mathbb{Z}(N)$                                  Group of $N$th roots of unity                        219
$\mathbb{Z}/N\mathbb{Z}$                         Group of integers modulo $N$                         221
$G$, $|G|$                                       Abelian group and its order                          226, 228
$G \approx H$                                    Isomorphic groups                                    227
$G_1 \times G_2$                                 Direct product of groups                             228
$\mathbb{Z}^*(q)$                                Group of units modulo $q$                            227, 229, 244
$\hat{G}$                                        Dual group of $G$                                    231
$a | b$                                          $a$ divides $b$                                      242
$\gcd(a, b)$                                     Greatest common divisor of $a$ and $b$               242
$\varphi(q)$                                     Number of integers relatively prime to $q$           254
$\chi$, $\chi_0$                                 Dirichlet character, and trivial Dirichlet character 254
$L(s, \chi)$                                     Dirichlet $L$-function                               256
$\log_1\left(\frac{1}{1-z}\right)$, $\log_2 L(s, \chi)$   Logarithms                                  258, 264
$d(k)$                                           Number of positive divisors of $k$                   269


Index


Relevant items that also arise in Book I are listed in this index, preceded
by the numeral I.


Abel 

means, 54 
summable, 54 
abelian group, 226 
absolute value, 23 
absorption coefficient, 199 
amplitude, 3 

annihilation operator, 169 
approximation to the identity, 
49 

attenuation coefficient, 199 

Bernoulli 

numbers, 97, 167 
polynomials, 98 
Bernstein’s theorem, 93 
Bessel function, 197 
Bessel’s inequality, 80 
best approximation lemma, 78 
Black-Scholes equation, 170 
bump functions, 162 

Cauchy problem (wave equa¬ 
tion), 185 

Cauchy sequence, 24 
Cauchy-Schwarz inequality, 72 
Cesaro 

means, 52 
sum, 52 
summable, 52 
character, 230 

trivial (unit), 230 
class $C^k$, 44


closed rectangle, 289 
complete vector space, 74 
complex 

conjugate, 23 
exponential, 24 
congruent integers, 220 
conjugate Dirichlet kernel, 95 
convolution, 44, 139, 239 
coordinates 

spherical in $\mathbb{R}^d$, 293
creation operator, 169 
curve, 102 

area enclosed, 103 
closed, 102
length, 102
simple, 102 

d’Alembert’s formula, 11 
delayed means, 114 
generalized, 127 
descent (method of), 194 
dilations, 133, 177
real, 265
Dirichlet kernel
modified (on the real line),
165

on the circle, 37
Dirichlet problem
annulus, 64
rectangle, 28
strip, 170
unit disc, 20

Dirichlet product formula, 256
Dirichlet’s test, 60
Dirichlet’s theorem, 128
discontinuity 
jump, 63 

of a Riemann integrable 
function, 286 

divisibility of integers, 242 
divisor, 242 

greatest common, 242 
divisor function, 269, 280 
dual 

X-ray transform, 212 
group, 231 

Radon transform, 205 

eigenvalues and eigenvectors, 
233 

energy, 148, 187 
of a string, 90 

equidistributed sequence, 107 
ergodicity, 111 
Euclid’s algorithm, 241 
Euler 

constant $\gamma$, 268
identities, 25 
phi-function, 254, 276 
product formula, 249 
even function, 10 
expectation, 160 
exponential function, 24 
exponential sum, 112 

fast Fourier transform, 224 
Fejer kernel 

on the circle, 53 
on the real line, 163 
Fibonacci numbers, 122
Fourier series, 34, 235
Cesàro mean, 53
delayed means, 114
generalized delayed means,
127
partial sum, 35
Fourier series convergence
mean square, 70 
pointwise, 81, 128 
Fourier series divergence, 83 
Fourier transform, 134, 136, 181 
fractional part, 106 
function 

Bessel, 197 
exponential, 24 
gamma, 165 

moderate decrease, 131, 
179, 294 
radial, 182 

rapidly decreasing, 135, 178 
sawtooth, 60, 83 
Schwartz, 135, 180 
theta, 155 
zeta, 98 

gamma function, 165
d-dimensions, 209
122
131
overtones, 6, 13
part
Poincaré inequality, 90


Foreword


Beginning in the spring of 2000, a series of four one-semester courses 
were taught at Princeton University whose purpose was to present, in 
an integrated manner, the core areas of analysis. The objective was to 
make plain the organic unity that exists between the various parts of the 
subject, and to illustrate the wide applicability of ideas of analysis to 
other fields of mathematics and science. The present series of books is 
an elaboration of the lectures that were given. 

While there are a number of excellent texts dealing with individual 
parts of what we cover, our exposition aims at a different goal:
presenting the various sub-areas of analysis not as separate disciplines, but
rather as highly interconnected. It is our view that seeing these relations 
and their resulting synergies will motivate the reader to attain a better 
understanding of the subject as a whole. With this outcome in mind, we 
have concentrated on the main ideas and theorems that have shaped the 
field (sometimes sacrificing a more systematic approach), and we have 
been sensitive to the historical order in which the logic of the subject 
developed. 

We have organized our exposition into four volumes, each reflecting 
the material covered in a semester. Their contents may be broadly
summarized as follows:

I. Fourier series and integrals. 

II. Complex analysis. 

III. Measure theory, Lebesgue integration, and Hilbert spaces. 

IV. A selection of further topics, including functional analysis,
distributions, and elements of probability theory.

However, this listing does not by itself give a complete picture of 
the many interconnections that are presented, nor of the applications 
to other branches that are highlighted. To give a few examples: the
elements of (finite) Fourier series studied in Book I, which lead to Dirichlet
characters, and from there to the infinitude of primes in an arithmetic
progression; the X-ray and Radon transforms, which arise in a number of
problems in Book I, and reappear in Book III to play an important role in
understanding Besicovitch-like sets in two and three dimensions; Fatou’s 
theorem, which guarantees the existence of boundary values of bounded 
holomorphic functions in the disc, and whose proof relies on ideas
developed in each of the first three books; and the theta function, which first
occurs in Book I in the solution of the heat equation, and is then used 
in Book II to find the number of ways an integer can be represented as 
the sum of two or four squares, and in the analytic continuation of the 
zeta function. 

A few further words about the books and the courses on which they 
were based. These courses were given at a rather intensive pace, with 48
lecture-hours a semester. The weekly problem sets played an indispensable
part, and as a result exercises and problems have a similarly important
role in our books. Each chapter has a series of “Exercises” that
are tied directly to the text, and while some are easy, others may require
more effort. However, the substantial number of hints that are given
should enable the reader to attack most exercises. There are also more
involved and challenging “Problems”; the ones that are most difficult, or
go beyond the scope of the text, are marked with an asterisk.

Despite the substantial connections that exist between the different 
volumes, enough overlapping material has been provided so that each of 
the first three books requires only minimal prerequisites: acquaintance 
with elementary topics in analysis such as limits, series, differentiable 
functions, and Riemann integration, together with some exposure to
linear algebra. This makes these books accessible to students interested
in such diverse disciplines as mathematics, physics, engineering, and 
finance, at both the undergraduate and graduate level. 

It is with great pleasure that we express our appreciation to all who 
have aided in this enterprise. We are particularly grateful to the
students who participated in the four courses. Their continuing interest,
enthusiasm, and dedication provided the encouragement that made this 
project possible. We also wish to thank Adrian Banner and Jose Luis 
Rodrigo for their special help in running the courses, and their efforts to 
see that the students got the most from each class. In addition, Adrian 
Banner also made valuable suggestions that are incorporated in the text.

We wish also to record a note of special thanks for the following in¬
ing part of the manuscript taught several weeks of one of the courses, and
least, our thanks go to Gerree Pecht, for her consummate skill in typesetting
and for the time and energy she spent in the preparation of all
aspects of the lectures, such as transparencies, notes, and the manuscript.


Introduction


... In effect, if one extends these functions by allowing 
complex values for the arguments, then there arises 
a harmony and regularity which without it would
remain hidden.

B. Riemann, 1851 


When we begin the study of complex analysis we enter a marvelous 
world, full of wonderful insights. We are tempted to use the adjectives 
magical, or even miraculous when describing the first theorems we learn; 
and in pursuing the subject, we continue to be astonished by the elegance 
and sweep of the results. 

The starting point of our study is the idea of extending a function 
initially given for real values of the argument to one that is defined when 
the argument is complex. Thus, here the central objects are functions 
from the complex plane to itself 

$$f : \mathbb{C} \to \mathbb{C},$$


or more generally, complex-valued functions defined on open subsets of C. 
At first, one might object that nothing new is gained from this extension, 
since any complex number $z$ can be written as $z = x + iy$ where $x, y \in \mathbb{R}$
and $z$ is identified with the point $(x, y)$ in $\mathbb{R}^2$.

However, everything changes drastically if we make a natural, but 
misleadingly simple-looking assumption on $f$: that it is differentiable
in the complex sense. This condition is called holomorphicity, and it
shapes most of the theory discussed in this book.

A function $f : \mathbb{C} \to \mathbb{C}$ is holomorphic at the point $z \in \mathbb{C}$ if the limit

$$\lim_{h \to 0} \frac{f(z + h) - f(z)}{h} \qquad (h \in \mathbb{C})$$

exists. This is similar to the definition of differentiability in the case of 
a real argument, except that we allow h to take complex values. The 
reason this assumption is so far-reaching is that, in fact, it encompasses 
a multiplicity of conditions: so to speak, one for each angle at which $h$ can
approach zero.

Although one might now be tempted to prove theorems about holomorphic
functions in terms of real variables, the reader will soon discover
that complex analysis is a new subject, one which supplies proofs to the
theorems that are proper to its own nature. In fact, the proofs of the
main properties of holomorphic functions which we discuss in the next
chapters are generally very short and quite illuminating.

The study of complex analysis proceeds along two paths that often 
intersect. In following the first way, we seek to understand the universal
characteristics of holomorphic functions, without special regard for
specific examples. The second approach is the analysis of some particular
functions that have proved to be of great interest in other areas of
mathematics. Of course, we cannot go too far along either path without 
having traveled some way along the other. We shall start our study with 
some general characteristic properties of holomorphic functions, which 
are subsumed by three rather miraculous facts: 

1. Contour integration: If $f$ is holomorphic in $\Omega$, then for appropriate
closed paths in $\Omega$

$$\int_\gamma f(z)\,dz = 0.$$

2. Regularity: If $f$ is holomorphic, then $f$ is indefinitely differentiable.

3. Analytic continuation: If $f$ and $g$ are holomorphic functions
in $\Omega$ which are equal in an arbitrarily small disc in $\Omega$, then $f = g$
everywhere in $\Omega$.

These three phenomena and other general properties of holomorphic 
functions are treated in the beginning chapters of this book. Instead 
of trying to summarize the contents of the rest of this volume, we mention
briefly several other highlights of the subject.

• The zeta function, which is expressed as an infinite series

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s},$$

and is initially defined and holomorphic in the half-plane $\mathrm{Re}(s) > 1$,
where the convergence of the sum is guaranteed. This function
and its variants (the L-series) are central in the theory of prime
numbers, and have already appeared in Chapter 8 of Book I, where
we proved Dirichlet’s theorem. Here we will prove that $\zeta$ extends to
a meromorphic function with a pole at $s = 1$. We shall see that the
behavior of $\zeta(s)$ for $\mathrm{Re}(s) = 1$ (and in particular that $\zeta$ does not
vanish on that line) leads to a proof of the prime number theorem.

• The theta function

$$\Theta(z|\tau) = \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n z},$$

which in fact is a function of the two complex variables $z$ and $\tau$,
holomorphic for all $z$, but only for $\tau$ in the half-plane $\mathrm{Im}(\tau) > 0$.
On the one hand, when we fix $\tau$, and think of $\Theta$ as a function of
$z$, it is closely related to the theory of elliptic (doubly-periodic)
functions. On the other hand, when $z$ is fixed, $\Theta$ displays features
of a modular function in the upper half-plane. The function $\Theta(z|\tau)$
arose in Book I as a fundamental solution of the heat equation on
the circle. It will be used again in the study of the zeta function, as
well as in the proof of certain results in combinatorics and number
theory given in Chapters 6 and 10.

Two additional noteworthy topics that we treat are: the Fourier transform
with its elegant connection to complex analysis via contour integration,
and the resulting applications of the Poisson summation formula;
also conformal mappings, with the mappings of polygons whose inverses
are realized by the Schwarz-Christoffel formula, and the particular case
of the rectangle, which leads to elliptic integrals and elliptic functions.



Preliminaries to Complex 
Analysis 


The sweeping development of mathematics during the
last two centuries is due in large part to the introduction
of complex numbers; paradoxically, this is based
on the seemingly absurd notion that there are numbers
whose squares are negative.


E. Borel, 1952 


This chapter is devoted to the exposition of basic preliminary material
which we use extensively throughout this book.

We begin with a quick review of the algebraic and analytic properties 
of complex numbers followed by some topological notions of sets in the 
complex plane. (See also the exercises at the end of Chapter 1 in Book I.) 

Then, we define precisely the key notion of holomorphicity, which is 
the complex analytic version of differentiability. This allows us to discuss 
the Cauchy-Riemann equations, and power series. 

Finally, we define the notion of a curve and the integral of a function 
along it. In particular, we shall prove an important result, which we state 
loosely as follows: if a function $f$ has a primitive, in the sense that there
exists a function $F$ that is holomorphic and whose derivative is precisely
$f$, then for any closed curve $\gamma$

$$\int_\gamma f(z)\,dz = 0.$$

This is the first step towards Cauchy’s theorem, which plays a central 
role in complex function theory. 

1 Complex numbers and the complex plane 

Many of the facts covered in this section were already used in Book I. 

1.1 Basic properties 

A complex number takes the form $z = x + iy$ where $x$ and $y$ are real,
and $i$ is an imaginary number that satisfies $i^2 = -1$. We call $x$ and $y$ the
real part and the imaginary part of $z$, respectively, and we write

$$x = \mathrm{Re}(z) \quad \text{and} \quad y = \mathrm{Im}(z).$$

The real numbers are precisely those complex numbers with zero imaginary
parts. A complex number with zero real part is said to be purely
imaginary.

Throughout our presentation, the set of all complex numbers is denoted
by $\mathbb{C}$. The complex numbers can be visualized as the usual Euclidean
plane by the following simple identification: the complex number
$z = x + iy \in \mathbb{C}$ is identified with the point $(x, y) \in \mathbb{R}^2$. For example, 0
corresponds to the origin and $i$ corresponds to $(0, 1)$. Naturally, the $x$
and $y$ axes of $\mathbb{R}^2$ are called the real axis and imaginary axis, because
they correspond to the real and purely imaginary numbers, respectively.
(See Figure 1.)


Figure 1. The complex plane

The natural rules for adding and multiplying complex numbers can be
obtained simply by treating all numbers as if they were real, and keeping
in mind that $i^2 = -1$. If $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$, then

$$z_1 + z_2 = (x_1 + x_2) + i(y_1 + y_2),$$

and also

$$\begin{aligned}
z_1 z_2 &= (x_1 + iy_1)(x_2 + iy_2) \\
&= x_1 x_2 + i x_1 y_2 + i y_1 x_2 + i^2 y_1 y_2 \\
&= (x_1 x_2 - y_1 y_2) + i(x_1 y_2 + y_1 x_2).
\end{aligned}$$

If we take the two expressions above as the definitions of addition and
multiplication, it is a simple matter to verify the following desirable
properties:

• Commutativity: $z_1 + z_2 = z_2 + z_1$ and $z_1 z_2 = z_2 z_1$ for all $z_1, z_2 \in \mathbb{C}$.

• Associativity: $(z_1 + z_2) + z_3 = z_1 + (z_2 + z_3)$; and $(z_1 z_2) z_3 =
z_1 (z_2 z_3)$ for $z_1, z_2, z_3 \in \mathbb{C}$.

• Distributivity: $z_1 (z_2 + z_3) = z_1 z_2 + z_1 z_3$ for all $z_1, z_2, z_3 \in \mathbb{C}$.

Of course, addition of complex numbers corresponds to addition of the
corresponding vectors in the plane $\mathbb{R}^2$. Multiplication, however, consists
of a rotation composed with a dilation, a fact that will become transparent
once we have introduced the polar form of a complex number. At
present we observe that multiplication by $i$ corresponds to a rotation by
an angle of $\pi/2$.
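
The multiplication rule and the rotation by $i$ can be checked on a small example. The following is only an illustrative sketch in Python using the built-in `complex` type; the helper name `multiply` and the sample values are our own choices and are not part of the text.

```python
def multiply(z1: complex, z2: complex) -> complex:
    """Multiply using the rule (x1 x2 - y1 y2) + i (x1 y2 + y1 x2)."""
    x1, y1 = z1.real, z1.imag
    x2, y2 = z2.real, z2.imag
    return complex(x1 * x2 - y1 * y2, x1 * y2 + y1 * x2)

z, w = complex(3, 4), complex(1, -2)
product = multiply(z, w)     # (3 + 4i)(1 - 2i) = 11 - 2i
rotated = multiply(1j, z)    # i(3 + 4i) = -4 + 3i

# A rotation by pi/2 keeps the length and makes the vector perpendicular to z,
# so the Euclidean dot product of z and iz, viewed as vectors in R^2, is zero.
dot = z.real * rotated.real + z.imag * rotated.imag
print(product, rotated, dot)
```

The rule agrees with Python's own `*` on complex numbers, which is a convenient cross-check when experimenting with other values.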

The notion of length, or absolute value of a complex number is identical
to the notion of Euclidean length in $\mathbb{R}^2$. More precisely, we define the
absolute value of a complex number $z = x + iy$ by

$$|z| = (x^2 + y^2)^{1/2},$$

so that $|z|$ is precisely the distance from the origin to the point $(x, y)$. In
particular, the triangle inequality holds:

$$|z + w| \le |z| + |w| \quad \text{for all } z, w \in \mathbb{C}.$$

We record here other useful inequalities. For all $z \in \mathbb{C}$ we have both
$|\mathrm{Re}(z)| \le |z|$ and $|\mathrm{Im}(z)| \le |z|$, and for all $z, w \in \mathbb{C}$

$$\big| |z| - |w| \big| \le |z - w|.$$

This follows from the triangle inequality since

$$|z| \le |z - w| + |w| \quad \text{and} \quad |w| \le |z - w| + |z|.$$

The complex conjugate of $z = x + iy$ is defined by

$$\overline{z} = x - iy,$$

and it is obtained by a reflection across the real axis in the plane. In
fact a complex number $z$ is real if and only if $z = \overline{z}$, and it is purely
imaginary if and only if $z = -\overline{z}$.

The reader should have no difficulty checking that

$$\mathrm{Re}(z) = \frac{z + \overline{z}}{2} \quad \text{and} \quad \mathrm{Im}(z) = \frac{z - \overline{z}}{2i}.$$

Also, one has

$$|z|^2 = z\overline{z} \quad \text{and as a consequence} \quad \frac{1}{z} = \frac{\overline{z}}{|z|^2} \quad \text{whenever } z \ne 0.$$


Any non-zero complex number $z$ can be written in polar form

$$z = re^{i\theta},$$

where $r > 0$; also $\theta \in \mathbb{R}$ is called the argument of $z$ (defined uniquely
up to a multiple of $2\pi$) and is often denoted by $\arg z$, and

$$e^{i\theta} = \cos\theta + i\sin\theta.$$

Since $|e^{i\theta}| = 1$ we observe that $r = |z|$, and $\theta$ is simply the angle (with
positive counterclockwise orientation) between the positive real axis and
the half-line starting at the origin and passing through $z$. (See Figure 2.)



Figure 2. The polar form of a complex number 


Finally, note that if $z = re^{i\theta}$ and $w = se^{i\varphi}$, then

$$zw = rse^{i(\theta + \varphi)},$$

so multiplication by a complex number corresponds to a homothety in
$\mathbb{R}^2$ (that is, a rotation composed with a dilation).
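
As a quick numerical illustration of the polar form, one may verify that moduli multiply and arguments add. The sketch below assumes Python's standard `cmath` module; the sample radii and angles are arbitrary choices made here, picked so that the sum of the arguments stays in the range $(-\pi, \pi]$ that `cmath.polar` returns.

```python
import cmath
import math

z = cmath.rect(2.0, math.pi / 6)   # r = 2, theta = pi/6
w = cmath.rect(3.0, math.pi / 3)   # s = 3, phi = pi/3

r, theta = cmath.polar(z)
s, phi = cmath.polar(w)
modulus, angle = cmath.polar(z * w)

# The product has modulus r*s and argument theta + phi (here pi/2).
print(modulus, r * s)
print(angle, theta + phi)
```

For other sample values the two arguments may differ by a multiple of $2\pi$, which reflects the fact that the argument is only defined up to such multiples.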
1.2 Convergence

We make a transition from the arithmetic and geometric properties of 
complex numbers described above to the key notions of convergence and 
limits. 

A sequence $\{z_1, z_2, \ldots\}$ of complex numbers is said to converge to
$w \in \mathbb{C}$ if

$$\lim_{n \to \infty} |z_n - w| = 0, \quad \text{and we write} \quad w = \lim_{n \to \infty} z_n.$$

This notion of convergence is not new. Indeed, since absolute values in
$\mathbb{C}$ and Euclidean distances in $\mathbb{R}^2$ coincide, we see that $z_n$ converges to $w$
if and only if the corresponding sequence of points in the complex plane
converges to the point that corresponds to $w$.

As an exercise, the reader can check that the sequence $\{z_n\}$ converges
to $w$ if and only if the sequences of real and imaginary parts of $z_n$ converge
to the real and imaginary parts of $w$, respectively.

Since it is sometimes not possible to readily identify the limit of a
sequence (for example, $\lim_{N \to \infty} \sum_{n=1}^{N} 1/n^3$), it is convenient to have a
condition on the sequence itself which is equivalent to its convergence. A
sequence $\{z_n\}$ is said to be a Cauchy sequence (or simply Cauchy) if

$$|z_n - z_m| \to 0 \quad \text{as } n, m \to \infty.$$

In other words, given $\epsilon > 0$ there exists an integer $N > 0$ so that
$|z_n - z_m| < \epsilon$ whenever $n, m > N$. An important fact of real analysis
is that $\mathbb{R}$ is complete: every Cauchy sequence of real numbers converges
to a real number.$^1$ Since the sequence $\{z_n\}$ is Cauchy if and only if the
sequences of real and imaginary parts of $z_n$ are, we conclude that every
Cauchy sequence in $\mathbb{C}$ has a limit in $\mathbb{C}$. We have thus the following result.

Theorem 1.1 $\mathbb{C}$, the complex numbers, is complete.
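
To see the Cauchy condition at work on a sequence whose limit is not easy to name, consider the partial sums $S_N = \sum_{n=1}^{N} 1/n^3$. The following Python sketch is purely illustrative: the function name `partial_sum`, the sample indices, and the integral comparison used for the bound are choices made here, not part of the text.

```python
def partial_sum(N: int) -> float:
    """S_N = sum_{n=1}^{N} 1/n^3."""
    return sum(1.0 / n**3 for n in range(1, N + 1))

# For m > n >= N the gap S_m - S_n is at most the tail sum_{k > N} 1/k^3,
# which is bounded by the integral of 1/x^3 from N to infinity, i.e. 1/(2 N^2).
N = 100
gaps = [abs(partial_sum(m) - partial_sum(n))
        for n in (100, 200, 400) for m in (800, 1600)]
bound = 1.0 / (2 * N**2)
print(max(gaps), bound)
```

Every gap between partial sums with indices at least $N$ falls below the bound, and the bound tends to $0$ as $N$ grows, which is exactly the Cauchy condition.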

We now turn our attention to some simple topological considerations 
that are necessary in our study of functions. Here again, the reader will 
note that no new notions are introduced, but rather previous notions are 
now presented in terms of a new vocabulary. 

1.3 Sets in the complex plane 

If $z_0 \in \mathbb{C}$ and $r > 0$, we define the open disc $D_r(z_0)$ of radius $r$ centered
at $z_0$ to be the set of all complex numbers that are at absolute
value strictly less than $r$ from $z_0$. In other words,

$$D_r(z_0) = \{z \in \mathbb{C} : |z - z_0| < r\},$$

and this is precisely the usual disc in the plane of radius $r$ centered at
$z_0$. The closed disc $\overline{D}_r(z_0)$ of radius $r$ centered at $z_0$ is defined by

$$\overline{D}_r(z_0) = \{z \in \mathbb{C} : |z - z_0| \le r\},$$

and the boundary of either the open or closed disc is the circle

$$C_r(z_0) = \{z \in \mathbb{C} : |z - z_0| = r\}.$$

Since the unit disc (that is, the open disc centered at the origin and of
radius 1) plays an important role in later chapters, we will often denote
it by $\mathbb{D}$,

$$\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\}.$$

$^1$ This is sometimes called the Bolzano-Weierstrass theorem.

Given a set $\Omega \subset \mathbb{C}$, a point $z_0$ is an interior point of $\Omega$ if there exists
$r > 0$ such that

$$D_r(z_0) \subset \Omega.$$

The interior of $\Omega$ consists of all its interior points. Finally, a set $\Omega$ is
open if every point in that set is an interior point of $\Omega$. This definition
coincides precisely with the definition of an open set in $\mathbb{R}^2$.

A set $\Omega$ is closed if its complement $\Omega^c = \mathbb{C} - \Omega$ is open. This property
can be reformulated in terms of limit points. A point $z \in \mathbb{C}$ is said to
be a limit point of the set $\Omega$ if there exists a sequence of points $z_n \in \Omega$
such that $z_n \ne z$ and $\lim_{n \to \infty} z_n = z$. The reader can now check that a
set is closed if and only if it contains all its limit points. The closure of
any set $\Omega$ is the union of $\Omega$ and its limit points, and is often denoted by
$\overline{\Omega}$.

Finally, the boundary of a set $\Omega$ is equal to its closure minus its
interior, and is often denoted by $\partial\Omega$.

A set $\Omega$ is bounded if there exists $M > 0$ such that $|z| < M$ whenever
$z \in \Omega$. In other words, the set $\Omega$ is contained in some large disc. If $\Omega$ is
bounded, we define its diameter by

$$\mathrm{diam}(\Omega) = \sup_{z, w \in \Omega} |z - w|.$$

A set $\Omega$ is said to be compact if it is closed and bounded. Arguing
as in the case of real variables, one can prove the following.

Theorem 1.2 The set $\Omega \subset \mathbb{C}$ is compact if and only if every sequence
$\{z_n\} \subset \Omega$ has a subsequence that converges to a point in $\Omega$.

An open covering of $\Omega$ is a family of open sets $\{U_\alpha\}$ (not necessarily
countable) such that

$$\Omega \subset \bigcup_\alpha U_\alpha.$$

In analogy with the situation in $\mathbb{R}$, we have the following equivalent
formulation of compactness.

Theorem 1.3 A set $\Omega$ is compact if and only if every open covering of
$\Omega$ has a finite subcovering.

Another interesting property of compactness is that of nested sets. 
We shall in fact use this result at the very beginning of our study of 
complex function theory, more precisely in the proof of Goursat’s theorem 
in Chapter 2. 

Proposition 1.4 If $\Omega_1 \supset \Omega_2 \supset \cdots \supset \Omega_n \supset \cdots$ is a sequence of non-empty
compact sets in $\mathbb{C}$ with the property that

$$\mathrm{diam}(\Omega_n) \to 0 \quad \text{as } n \to \infty,$$

then there exists a unique point $w \in \mathbb{C}$ such that $w \in \Omega_n$ for all $n$.

Proof. Choose a point $z_n$ in each $\Omega_n$. The condition $\mathrm{diam}(\Omega_n) \to 0$
says precisely that $\{z_n\}$ is a Cauchy sequence, therefore this sequence
converges to a limit that we call $w$. Since each set $\Omega_n$ is compact we
must have $w \in \Omega_n$ for all $n$. Finally, $w$ is the unique point satisfying this
property, for otherwise, if $w'$ satisfied the same property with $w' \ne w$
we would have $|w - w'| > 0$ and the condition $\mathrm{diam}(\Omega_n) \to 0$ would be
violated.

The last notion we need is that of connectedness. An open set $\Omega \subset \mathbb{C}$ is
said to be connected if it is not possible to find two disjoint non-empty
open sets $\Omega_1$ and $\Omega_2$ such that

$$\Omega = \Omega_1 \cup \Omega_2.$$

A connected open set in $\mathbb{C}$ will be called a region. Similarly, a closed
set $F$ is connected if one cannot write $F = F_1 \cup F_2$ where $F_1$ and $F_2$ are
disjoint non-empty closed sets.

There is an equivalent definition of connectedness for open sets in terms
of curves, which is often useful in practice: an open set $\Omega$ is connected
if and only if any two points in $\Omega$ can be joined by a curve $\gamma$ entirely
contained in $\Omega$. See Exercise 5 for more details.



2 Functions on the complex plane 

2.1 Continuous functions 

Let $f$ be a function defined on a set $\Omega$ of complex numbers. We say that
$f$ is continuous at the point $z_0 \in \Omega$ if for every $\epsilon > 0$ there exists $\delta > 0$
such that whenever $z \in \Omega$ and $|z - z_0| < \delta$ then $|f(z) - f(z_0)| < \epsilon$. An
equivalent definition is that for every sequence $\{z_1, z_2, \ldots\} \subset \Omega$ with
$\lim z_n = z_0$, we have $\lim f(z_n) = f(z_0)$.

The function $f$ is said to be continuous on $\Omega$ if it is continuous at
every point of $\Omega$. Sums and products of continuous functions are also
continuous.

Since the notions of convergence for complex numbers and points in
$\mathbb{R}^2$ are the same, the function $f$ of the complex argument $z = x + iy$ is
continuous if and only if it is continuous viewed as a function of the two
real variables $x$ and $y$.

By the triangle inequality, it is immediate that if $f$ is continuous, then
the real-valued function defined by $z \mapsto |f(z)|$ is continuous. We say that
$f$ attains a maximum at the point $z_0 \in \Omega$ if

$$|f(z)| \le |f(z_0)| \quad \text{for all } z \in \Omega,$$

with the inequality reversed for the definition of a minimum.

Theorem 2.1 A continuous function on a compact set $\Omega$ is bounded and
attains a maximum and minimum on $\Omega$.

This is of course analogous to the situation of functions of a real variable,
and we shall not repeat the simple proof here.


2.2 Holomorphic functions 


We now present a notion that is central to complex analysis, and in 
distinction to our previous discussion we introduce a definition that is 
genuinely complex in nature. 

Let $\Omega$ be an open set in $\mathbb{C}$ and $f$ a complex-valued function on $\Omega$. The
function $f$ is holomorphic at the point $z_0 \in \Omega$ if the quotient

$$\frac{f(z_0 + h) - f(z_0)}{h} \tag{1}$$

converges to a limit when $h \to 0$. Here $h \in \mathbb{C}$ and $h \ne 0$ with $z_0 + h \in \Omega$,
so that the quotient is well defined. The limit of the quotient, when it
exists, is denoted by $f'(z_0)$, and is called the derivative of $f$ at $z_0$:

$$f'(z_0) = \lim_{h \to 0} \frac{f(z_0 + h) - f(z_0)}{h}.$$

It should be emphasized that in the above limit, $h$ is a complex number
that may approach 0 from any direction.

The function $f$ is said to be holomorphic on $\Omega$ if $f$ is holomorphic
at every point of $\Omega$. If $C$ is a closed subset of $\mathbb{C}$, we say that $f$ is
holomorphic on $C$ if $f$ is holomorphic in some open set containing $C$.
Finally, if $f$ is holomorphic in all of $\mathbb{C}$ we say that $f$ is entire.

Sometimes the terms regular or complex differentiable are used instead
of holomorphic. The latter is natural in view of (1) which mimics
the usual definition of the derivative of a function of one real variable.
But despite this resemblance, a holomorphic function of one complex
variable will satisfy much stronger properties than a differentiable function
of one real variable. For example, a holomorphic function will actually
be infinitely many times complex differentiable, that is, the existence
of the first derivative will guarantee the existence of derivatives of any
order. This is in contrast with functions of one real variable, since there
are differentiable functions that do not have two derivatives. In fact more
is true: every holomorphic function is analytic, in the sense that it has a
power series expansion near every point (power series will be discussed
in the next section), and for this reason we also use the term analytic
as a synonym for holomorphic. Again, this is in contrast with the fact
that there are indefinitely differentiable functions of one real variable
that cannot be expanded in a power series. (See Exercise 23.)

Example 1. The function $f(z) = z$ is holomorphic on any open set in
$\mathbb{C}$, and $f'(z) = 1$. In fact, any polynomial

$$p(z) = a_0 + a_1 z + \cdots + a_n z^n$$

is holomorphic in the entire complex plane and

$$p'(z) = a_1 + \cdots + n a_n z^{n-1}.$$

This follows from Proposition 2.2 below.

Example 2. The function $1/z$ is holomorphic on any open set in $\mathbb{C}$ that
does not contain the origin, and $f'(z) = -1/z^2$.

Example 3. The function $f(z) = \overline{z}$ is not holomorphic. Indeed, we have

$$\frac{f(z_0 + h) - f(z_0)}{h} = \frac{\overline{h}}{h},$$

which has no limit as $h \to 0$, as one can see by first taking $h$ real and
then $h$ purely imaginary.
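
The dependence on the direction of approach in Example 3 can be observed numerically. The following Python sketch is only an illustration under assumptions made here: the helper `difference_quotient`, the base point, and the fixed small step are our own choices, and a finite step can suggest but not prove the existence or failure of a limit.

```python
import cmath
import math

def difference_quotient(f, z0: complex, h: complex) -> complex:
    """The quotient (f(z0 + h) - f(z0)) / h for a non-zero complex h."""
    return (f(z0 + h) - f(z0)) / h

z0 = complex(1.0, 2.0)
step = 1e-6
# Small increments along the real axis, the diagonal, and the imaginary axis.
directions = [cmath.rect(step, angle) for angle in (0.0, math.pi / 4, math.pi / 2)]

square_quotients = [difference_quotient(lambda z: z * z, z0, h) for h in directions]
conj_quotients = [difference_quotient(lambda z: z.conjugate(), z0, h) for h in directions]

print(square_quotients)  # all close to 2*z0 = 2 + 4i
print(conj_quotients)    # close to 1, -i and -1: no common value
```

For $z^2$ the three quotients agree up to an error of the size of the step, while for $\overline{z}$ the quotient equals $\overline{h}/h$ and changes with the direction of $h$.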
An important family of examples of holomorphic functions, which
we discuss later, are the power series. They contain functions such as
$e^z$, $\sin z$, or $\cos z$, and in fact power series play a crucial role in the theory
of holomorphic functions, as we already mentioned in the last paragraph.
Some other examples of holomorphic functions that will make their appearance
in later chapters were given in the introduction to this book.

It is clear from (1) above that a function $f$ is holomorphic at $z_0 \in \Omega$
if and only if there exists a complex number $a$ such that

$$f(z_0 + h) - f(z_0) - ah = h\psi(h), \tag{2}$$

where $\psi$ is a function defined for all small $h$ and $\lim_{h \to 0} \psi(h) = 0$. Of
course, we have $a = f'(z_0)$. From this formulation, it is clear that $f$ is
continuous wherever it is holomorphic. Arguing as in the case of one real
variable, using formulation (2) in the case of the chain rule (for example),
one proves easily the following desirable properties of holomorphic
functions.

Proposition 2.2 If $f$ and $g$ are holomorphic in $\Omega$, then:

(i) $f + g$ is holomorphic in $\Omega$ and $(f + g)' = f' + g'$.

(ii) $fg$ is holomorphic in $\Omega$ and $(fg)' = f'g + fg'$.

(iii) If $g(z_0) \ne 0$, then $f/g$ is holomorphic at $z_0$ and

$$(f/g)' = \frac{f'g - fg'}{g^2}.$$

Moreover, if $f : \Omega \to U$ and $g : U \to \mathbb{C}$ are holomorphic, the chain rule
holds

$$(g \circ f)'(z) = g'(f(z)) f'(z) \quad \text{for all } z \in \Omega.$$


Complex-valued functions as mappings 

We now clarify the relationship between the complex and real derivatives.
In fact, the third example above should convince the reader that
the notion of complex differentiability differs significantly from the usual
notion of real differentiability of a function of two real variables. Indeed,
in terms of real variables, the function $f(z) = \overline{z}$ corresponds to the map
$F : (x, y) \mapsto (x, -y)$, which is differentiable in the real sense. Its derivative
at a point is the linear map given by its Jacobian, the $2 \times 2$ matrix
of partial derivatives of the coordinate functions. In fact, $F$ is linear and
is therefore equal to its derivative at every point. This implies that $F$ is
actually indefinitely differentiable. In particular the existence of the real
derivative need not guarantee that $f$ is holomorphic.

This example leads us to associate more generally to each complex-valued
function $f = u + iv$ the mapping $F(x, y) = (u(x, y), v(x, y))$ from
$\mathbb{R}^2$ to $\mathbb{R}^2$.

Recall that a function $F(x, y) = (u(x, y), v(x, y))$ is said to be differentiable
at a point $P_0 = (x_0, y_0)$ if there exists a linear transformation
$J : \mathbb{R}^2 \to \mathbb{R}^2$ such that

$$\frac{|F(P_0 + H) - F(P_0) - J(H)|}{|H|} \to 0 \quad \text{as } |H| \to 0,\ H \in \mathbb{R}^2. \tag{3}$$

Equivalently, we can write

$$F(P_0 + H) - F(P_0) = J(H) + |H|\Psi(H),$$

with $|\Psi(H)| \to 0$ as $|H| \to 0$. The linear transformation $J$ is unique and
is called the derivative of $F$ at $P_0$. If $F$ is differentiable, the partial
derivatives of $u$ and $v$ exist, and the linear transformation $J$ is described
in the standard basis of $\mathbb{R}^2$ by the Jacobian matrix of $F$

$$J = J_F(x, y) = \begin{pmatrix} \partial u/\partial x & \partial u/\partial y \\ \partial v/\partial x & \partial v/\partial y \end{pmatrix}.$$

In the case of complex differentiation the derivative is a complex number
$f'(z_0)$, while in the case of real derivatives, it is a matrix. There is,
however, a connection between these two notions, which is given in terms
of special relations that are satisfied by the entries of the Jacobian matrix,
that is, the partials of $u$ and $v$. To find these relations, consider the limit
in (1) when $h$ is first real, say $h = h_1 + ih_2$ with $h_2 = 0$. Then, if we
write $z = x + iy$, $z_0 = x_0 + iy_0$, and $f(z) = f(x, y)$, we find that

$$f'(z_0) = \lim_{h_1 \to 0} \frac{f(x_0 + h_1, y_0) - f(x_0, y_0)}{h_1} = \frac{\partial f}{\partial x}(z_0),$$

where $\partial/\partial x$ denotes the usual partial derivative in the $x$ variable. (We fix
$y_0$ and think of $f$ as a complex-valued function of the single real variable
$x$.) Now taking $h$ purely imaginary, say $h = ih_2$, a similar argument
yields

$$f'(z_0) = \lim_{h_2 \to 0} \frac{f(x_0, y_0 + h_2) - f(x_0, y_0)}{ih_2} = \frac{1}{i}\frac{\partial f}{\partial y}(z_0),$$

where $\partial/\partial y$ is partial differentiation in the $y$ variable. Therefore, if $f$ is
holomorphic we have shown that

$$\frac{\partial f}{\partial x} = \frac{1}{i}\frac{\partial f}{\partial y}.$$

Writing $f = u + iv$, we find after separating real and imaginary parts
and using $1/i = -i$, that the partials of $u$ and $v$ exist, and they satisfy
the following non-trivial relations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad \text{and} \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$

These are the Cauchy-Riemann equations, which link real and complex
analysis.
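
The Cauchy-Riemann equations lend themselves to a simple numerical check. The sketch below, in Python, estimates the four partial derivatives by central differences; the helper names `partials` and `cauchy_riemann_residuals`, the step size, and the sample point are assumptions made for this illustration only.

```python
import cmath

def partials(f, x: float, y: float, eps: float = 1e-6):
    """Central-difference estimates of (u_x, u_y, v_x, v_y) for f = u + iv."""
    fx = (f(complex(x + eps, y)) - f(complex(x - eps, y))) / (2 * eps)
    fy = (f(complex(x, y + eps)) - f(complex(x, y - eps))) / (2 * eps)
    return fx.real, fy.real, fx.imag, fy.imag

def cauchy_riemann_residuals(f, x: float, y: float):
    """Return (u_x - v_y, u_y + v_x); both vanish when the equations hold."""
    u_x, u_y, v_x, v_y = partials(f, x, y)
    return u_x - v_y, u_y + v_x

exp_residuals = cauchy_riemann_residuals(cmath.exp, 0.3, -0.7)
conj_residuals = cauchy_riemann_residuals(lambda z: z.conjugate(), 0.3, -0.7)
print(exp_residuals)   # both entries near 0
print(conj_residuals)  # first entry near 2, since u_x = 1 and v_y = -1
```

The exponential satisfies both equations up to the discretization error, whereas for $\overline{z}$ the first equation fails at every point.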

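We can clarify the situation further by defining two differential operators

$$\frac{\partial}{\partial z} = \frac{1}{2}\left(\frac{\partial}{\partial x} + \frac{1}{i}\frac{\partial}{\partial y}\right) \quad \text{and} \quad \frac{\partial}{\partial \overline{z}} = \frac{1}{2}\left(\frac{\partial}{\partial x} - \frac{1}{i}\frac{\partial}{\partial y}\right).$$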

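Proposition 2.3 If $f$ is holomorphic at $z_0$, then

$$\frac{\partial f}{\partial \overline{z}}(z_0) = 0 \quad \text{and} \quad f'(z_0) = \frac{\partial f}{\partial z}(z_0) = 2\frac{\partial u}{\partial z}(z_0).$$

Also, if we write $F(x, y) = f(z)$, then $F$ is differentiable in the sense of
real variables, and

$$\det J_F(x_0, y_0) = |f'(z_0)|^2.$$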

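Proof. Taking real and imaginary parts, it is easy to see that the
Cauchy-Riemann equations are equivalent to $\partial f/\partial \overline{z} = 0$. Moreover, by
our earlier observation

$$f'(z_0) = \frac{1}{2}\left(\frac{\partial f}{\partial x}(z_0) + \frac{1}{i}\frac{\partial f}{\partial y}(z_0)\right) = \frac{\partial f}{\partial z}(z_0),$$

and the Cauchy-Riemann equations give $\partial f/\partial z = 2\,\partial u/\partial z$. To prove
that $F$ is differentiable it suffices to observe that if $H = (h_1, h_2)$ and
$h = h_1 + ih_2$, then the Cauchy-Riemann equations imply

$$J_F(x_0, y_0)(H) = \left(\frac{\partial u}{\partial x} - i\frac{\partial u}{\partial y}\right)(h_1 + ih_2) = f'(z_0)h,$$

where we have identified a complex number with the pair of real and
imaginary parts. After a final application of the Cauchy-Riemann equations,
the above results imply that

$$\det J_F(x_0, y_0) = \frac{\partial u}{\partial x}\frac{\partial v}{\partial y} - \frac{\partial v}{\partial x}\frac{\partial u}{\partial y} = \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 = \left|2\frac{\partial u}{\partial z}\right|^2 = |f'(z_0)|^2. \tag{4}$$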




So far, we have assumed that $f$ is holomorphic and deduced relations
satisfied by its real and imaginary parts. The next theorem contains an
important converse, which completes the circle of ideas presented here.


Theorem 2.4 Suppose $f = u + iv$ is a complex-valued function defined
on an open set $\Omega$. If $u$ and $v$ are continuously differentiable and satisfy
the Cauchy-Riemann equations on $\Omega$, then $f$ is holomorphic on $\Omega$ and
$f'(z) = \partial f/\partial z$.


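Proof. Write

$$u(x + h_1, y + h_2) - u(x, y) = \frac{\partial u}{\partial x}h_1 + \frac{\partial u}{\partial y}h_2 + |h|\psi_1(h)$$

and

$$v(x + h_1, y + h_2) - v(x, y) = \frac{\partial v}{\partial x}h_1 + \frac{\partial v}{\partial y}h_2 + |h|\psi_2(h),$$

where $\psi_j(h) \to 0$ (for $j = 1, 2$) as $|h|$ tends to 0, and $h = h_1 + ih_2$. Using
the Cauchy-Riemann equations we find that

$$f(z + h) - f(z) = \left(\frac{\partial u}{\partial x} - i\frac{\partial u}{\partial y}\right)(h_1 + ih_2) + |h|\psi(h),$$

where $\psi(h) = \psi_1(h) + i\psi_2(h) \to 0$, as $|h| \to 0$. Therefore $f$ is holomorphic
and

$$f'(z) = 2\frac{\partial u}{\partial z} = \frac{\partial f}{\partial z}.$$

2.3 Power series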

The prime example of a power series is the complex exponential function,
which is defined for $z \in \mathbb{C}$ by

$$e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!}.$$

When $z$ is real, this definition coincides with the usual exponential function,
and in fact, the series above converges absolutely for every $z \in \mathbb{C}$.
To see this, note that

$$\left|\frac{z^n}{n!}\right| = \frac{|z|^n}{n!},$$

so $|e^z|$ can be compared to the series $\sum |z|^n/n! = e^{|z|} < \infty$. In fact, this
estimate shows that the series defining $e^z$ is uniformly convergent in every
disc in $\mathbb{C}$.

In this section we will prove that $e^z$ is holomorphic in all of $\mathbb{C}$ (it is
entire), and that its derivative can be found by differentiating the series
term by term. Hence

$$(e^z)' = \sum_{n=0}^{\infty} n\frac{z^{n-1}}{n!} = \sum_{m=0}^{\infty} \frac{z^m}{m!} = e^z,$$

and therefore $e^z$ is its own derivative.
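
The rapid convergence of the exponential series, and the fact that its termwise derivative reproduces the same series, can be illustrated numerically. The Python sketch below compares partial sums with the library value `cmath.exp`; the function names, the truncation at 40 terms, and the sample point are choices made here for illustration.

```python
import cmath

def exp_partial_sum(z: complex, N: int) -> complex:
    """sum_{n=0}^{N} z^n / n!, building each term from the previous one."""
    total, term = 0j, 1 + 0j          # term holds z^n / n!
    for n in range(N + 1):
        total += term
        term *= z / (n + 1)
    return total

def exp_derivative_partial_sum(z: complex, N: int) -> complex:
    """Termwise derivative: sum_{n=1}^{N} n z^(n-1) / n!."""
    total, term = 0j, 1 + 0j          # term holds z^(n-1) / (n-1)!
    for n in range(1, N + 1):
        total += term
        term *= z / n
    return total

z = complex(1.5, -2.0)
series_error = abs(exp_partial_sum(z, 40) - cmath.exp(z))
derivative_error = abs(exp_derivative_partial_sum(z, 40) - cmath.exp(z))
print(series_error, derivative_error)
```

Both errors are at the level of floating-point rounding, in agreement with the identity $(e^z)' = e^z$.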

In contrast, the geometric series

$$\sum_{n=0}^{\infty} z^n$$

converges absolutely only in the disc $|z| < 1$, and its sum there is the
function $1/(1 - z)$, which is holomorphic in the open set $\mathbb{C} - \{1\}$. This
identity is proved exactly as when $z$ is real: we first observe

$$\sum_{n=0}^{N} z^n = \frac{1 - z^{N+1}}{1 - z},$$

and then note that if $|z| < 1$ we must have $\lim_{N \to \infty} z^{N+1} = 0$.

In general, a power series is an expansion of the form

$$\sum_{n=0}^{\infty} a_n z^n \tag{5}$$

where $a_n \in \mathbb{C}$. To test for absolute convergence of this series, we must
investigate

$$\sum_{n=0}^{\infty} |a_n| |z|^n,$$

and we observe that if the series (5) converges absolutely for some $z_0$,
then it will also converge for all $z$ in the disc $|z| \le |z_0|$. We now prove
that there always exists an open disc (possibly empty) on which the
power series converges absolutely.

Theorem 2.5 Given a power series $\sum_{n=0}^{\infty} a_n z^n$, there exists $0 \le R \le \infty$
such that:

(i) If $|z| < R$ the series converges absolutely.

(ii) If $|z| > R$ the series diverges.

Moreover, if we use the convention that $1/0 = \infty$ and $1/\infty = 0$, then $R$
is given by Hadamard’s formula

$$1/R = \limsup |a_n|^{1/n}.$$

The number $R$ is called the radius of convergence of the power series,
and the region $|z| < R$ the disc of convergence. In particular, we
have $R = \infty$ in the case of the exponential function, and $R = 1$ for the
geometric series.

Proof. Let $L = 1/R$ where $R$ is defined by the formula in the statement
of the theorem, and suppose that $L \ne 0, \infty$. (These two easy cases
are left as an exercise.) If $|z| < R$, choose $\epsilon > 0$ so small that

$$(L + \epsilon)|z| = r < 1.$$

By the definition of $L$, we have $|a_n|^{1/n} \le L + \epsilon$ for all large $n$, therefore

$$|a_n| |z|^n \le \{(L + \epsilon)|z|\}^n = r^n.$$

Comparison with the geometric series $\sum r^n$ shows that $\sum a_n z^n$ converges.

If $|z| > R$, then a similar argument proves that there exists a sequence
of terms in the series whose absolute value goes to infinity, hence the
series diverges.

Remark. On the boundary of the disc of convergence, $|z| = R$, the situation
is more delicate as one can have either convergence or divergence.
(See Exercise 19.)
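
Hadamard's formula can be explored numerically by tabulating $|a_n|^{1/n}$ for growing $n$. The Python sketch below is an illustration only: the helper `nth_root_of_coefficient` and the choice of working with $\log |a_n|$ (to avoid overflow when $a_n = 1/n!$) are assumptions made here, and finitely many values can only suggest the value of the lim sup.

```python
import math

def nth_root_of_coefficient(log_abs_coefficient: float, n: int) -> float:
    """|a_n|^(1/n), computed from log|a_n| to avoid overflow and underflow."""
    return math.exp(log_abs_coefficient / n)

for n in (10, 100, 1000, 10000):
    exponential = nth_root_of_coefficient(-math.lgamma(n + 1), n)  # a_n = 1/n!
    geometric = nth_root_of_coefficient(0.0, n)                    # a_n = 1
    linear = nth_root_of_coefficient(math.log(n), n)               # a_n = n
    print(n, exponential, geometric, linear)
```

The first column of values decreases toward $0$, consistent with $R = \infty$ for the exponential series, the second is identically $1$, and the third tends to $1$, reflecting the limit $n^{1/n} \to 1$.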
Further examples of power series that converge in the whole complex
plane are given by the standard trigonometric functions; these are
defined by

$$\cos z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n)!} \quad \text{and} \quad \sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n+1}}{(2n+1)!},$$

and they agree with the usual cosine and sine of a real argument whenever
$z \in \mathbb{R}$. A simple calculation exhibits a connection between these two
functions and the complex exponential, namely,

$$\cos z = \frac{e^{iz} + e^{-iz}}{2} \quad \text{and} \quad \sin z = \frac{e^{iz} - e^{-iz}}{2i}.$$

These are called the Euler formulas for the cosine and sine functions.
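
The Euler formulas can be compared with the defining series at a complex argument. The following Python sketch is illustrative; the function names `cos_series` and `sin_series`, the truncation at 30 terms, and the sample point are our own choices.

```python
import cmath
import math

def cos_series(z: complex, terms: int = 30) -> complex:
    """sum_{n=0}^{terms-1} (-1)^n z^(2n) / (2n)!"""
    return sum((-1) ** n * z ** (2 * n) / math.factorial(2 * n)
               for n in range(terms))

def sin_series(z: complex, terms: int = 30) -> complex:
    """sum_{n=0}^{terms-1} (-1)^n z^(2n+1) / (2n+1)!"""
    return sum((-1) ** n * z ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(terms))

z = complex(0.8, 1.1)
cos_euler = (cmath.exp(1j * z) + cmath.exp(-1j * z)) / 2
sin_euler = (cmath.exp(1j * z) - cmath.exp(-1j * z)) / (2j)
print(abs(cos_series(z) - cos_euler), abs(sin_series(z) - sin_euler))
```

Both differences are at the level of rounding error, and the values also agree with the library functions `cmath.cos` and `cmath.sin`.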

Power series provide a very important class of analytic functions that 
are particularly simple to manipulate. 

Theorem 2.6 The power series $f(z) = \sum_{n=0}^{\infty} a_n z^n$ defines a holomorphic
function in its disc of convergence. The derivative of $f$ is also a
power series obtained by differentiating term by term the series for $f$,
that is,

$$f'(z) = \sum_{n=0}^{\infty} n a_n z^{n-1}.$$

Moreover, $f'$ has the same radius of convergence as $f$.

Proof. The assertion about the radius of convergence of $f'$ follows
from Hadamard’s formula. Indeed, $\lim_{n \to \infty} n^{1/n} = 1$, and therefore

$$\limsup |a_n|^{1/n} = \limsup |n a_n|^{1/n},$$

so that $\sum a_n z^n$ and $\sum n a_n z^n$ have the same radius of convergence, and
hence so do $\sum a_n z^n$ and $\sum n a_n z^{n-1}$.

To prove the first assertion, we must show that the series

$$g(z) = \sum_{n=0}^{\infty} n a_n z^{n-1}$$

gives the derivative of $f$. For that, let $R$ denote the radius of convergence
of $f$, and suppose $|z_0| < r < R$. Write

$$f(z) = S_N(z) + E_N(z),$$

where

$$S_N(z) = \sum_{n=0}^{N} a_n z^n \quad \text{and} \quad E_N(z) = \sum_{n=N+1}^{\infty} a_n z^n.$$

Then, if $h$ is chosen so that $|z_0 + h| < r$ we have

$$\frac{f(z_0 + h) - f(z_0)}{h} - g(z_0) = \left(\frac{S_N(z_0 + h) - S_N(z_0)}{h} - S_N'(z_0)\right) + \big(S_N'(z_0) - g(z_0)\big) + \left(\frac{E_N(z_0 + h) - E_N(z_0)}{h}\right).$$

Since $a^n - b^n = (a - b)(a^{n-1} + a^{n-2}b + \cdots + ab^{n-2} + b^{n-1})$, we see that

$$\left|\frac{E_N(z_0 + h) - E_N(z_0)}{h}\right| \le \sum_{n=N+1}^{\infty} |a_n| \left|\frac{(z_0 + h)^n - z_0^n}{h}\right| \le \sum_{n=N+1}^{\infty} |a_n| n r^{n-1},$$

where we have used the fact that $|z_0| < r$ and $|z_0 + h| < r$. The expression
on the right is the tail end of a convergent series, since $g$ converges
absolutely on $|z| < R$. Therefore, given $\epsilon > 0$ we can find $N_1$ so that
$N > N_1$ implies

$$\left|\frac{E_N(z_0 + h) - E_N(z_0)}{h}\right| < \epsilon.$$

Also, since $\lim_{N \to \infty} S_N'(z_0) = g(z_0)$, we can find $N_2$ so that $N > N_2$
implies

$$|S_N'(z_0) - g(z_0)| < \epsilon.$$

If we fix $N$ so that both $N > N_1$ and $N > N_2$ hold, then we can find
$\delta > 0$ so that $|h| < \delta$ implies

$$\left|\frac{S_N(z_0 + h) - S_N(z_0)}{h} - S_N'(z_0)\right| < \epsilon,$$

simply because the derivative of a polynomial is obtained by differentiating
it term by term. Therefore,

$$\left|\frac{f(z_0 + h) - f(z_0)}{h} - g(z_0)\right| < 3\epsilon$$

whenever $|h| < \delta$, thereby concluding the proof of the theorem.
Successive applications of this theorem yield the following.

Corollary 2.7 A power series is infinitely complex differentiable in its
disc of convergence, and the higher derivatives are also power series obtained
by termwise differentiation.
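
Termwise differentiation can be tested on the geometric series, whose sum $1/(1 - z)$ has derivatives that are known in closed form. The Python sketch below is a numerical illustration under choices made here (the function names, the truncation at 200 terms, and the sample point with $|z| = 1/2$); it does not replace the proof.

```python
def derivative_series(z: complex, N: int) -> complex:
    """Partial sum sum_{n=1}^{N} n z^(n-1), the termwise derivative of sum z^n."""
    return sum(n * z ** (n - 1) for n in range(1, N + 1))

def second_derivative_series(z: complex, N: int) -> complex:
    """Partial sum sum_{n=2}^{N} n (n-1) z^(n-2), differentiating once more."""
    return sum(n * (n - 1) * z ** (n - 2) for n in range(2, N + 1))

z = complex(0.3, 0.4)   # |z| = 0.5, inside the disc of convergence
first_error = abs(derivative_series(z, 200) - 1 / (1 - z) ** 2)
second_error = abs(second_derivative_series(z, 200) - 2 / (1 - z) ** 3)
print(first_error, second_error)
```

The partial sums match the first and second derivatives of $1/(1 - z)$ up to rounding error at this point of the unit disc.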

We have so far dealt only with power series centered at the origin.
More generally, a power series centered at $z_0 \in \mathbb{C}$ is an expression of the
form

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n.$$

The disc of convergence of $f$ is now centered at $z_0$ and its radius is still
given by Hadamard’s formula. In fact, if

$$g(z) = \sum_{n=0}^{\infty} a_n z^n,$$

then $f$ is simply obtained by translating $g$, namely $f(z) = g(w)$ where
$w = z - z_0$. As a consequence everything about $g$ also holds for $f$ after
we make the appropriate translation. In particular, by the chain rule,

$$f'(z) = g'(w) = \sum_{n=0}^{\infty} n a_n (z - z_0)^{n-1}.$$

A function $f$ defined on an open set $\Omega$ is said to be analytic (or have
a power series expansion) at a point $z_0 \in \Omega$ if there exists a power
series $\sum a_n (z - z_0)^n$ centered at $z_0$, with positive radius of convergence,
such that

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n \quad \text{for all } z \text{ in a neighborhood of } z_0.$$

If $f$ has a power series expansion at every point in $\Omega$, we say that $f$ is
analytic on $\Omega$.

By Theorem 2.6, an analytic function on $\Omega$ is also holomorphic there.
A deep theorem which we prove in the next chapter says that the converse
is true: every holomorphic function is analytic. For that reason, we use
the terms holomorphic and analytic interchangeably.

3 Integration along curves 

In the definition of a curve, we distinguish between the one-dimensional
geometric object in the plane (endowed with an orientation), and its
parametrization, which is a mapping from a closed interval to $\mathbb{C}$, that is
not uniquely determined.

A parametrized curve is a function $z(t)$ which maps a closed interval
$[a, b] \subset \mathbb{R}$ to the complex plane. We shall impose regularity conditions
on the parametrization which are always verified in the situations that
occur in this book. We say that the parametrized curve is smooth if
$z'(t)$ exists and is continuous on $[a, b]$, and $z'(t) \ne 0$ for $t \in [a, b]$. At the
points $t = a$ and $t = b$, the quantities $z'(a)$ and $z'(b)$ are interpreted as
the one-sided limits

$$z'(a) = \lim_{h \to 0,\ h > 0} \frac{z(a+h) - z(a)}{h} \quad \text{and} \quad z'(b) = \lim_{h \to 0,\ h < 0} \frac{z(b+h) - z(b)}{h}.$$

In general, these quantities are called the right-hand derivative of $z(t)$ at
$a$, and the left-hand derivative of $z(t)$ at $b$, respectively.

Similarly we say that the parametrized curve is piecewise-smooth if
$z$ is continuous on $[a, b]$ and if there exist points

$$a = a_0 < a_1 < \cdots < a_n = b,$$

where $z(t)$ is smooth in the intervals $[a_k, a_{k+1}]$. In particular, the right-hand
derivative at $a_k$ may differ from the left-hand derivative at $a_k$ for
$k = 1, \ldots, n - 1$.

Two parametrizations,

$$z : [a, b] \to \mathbb{C} \quad \text{and} \quad \tilde{z} : [c, d] \to \mathbb{C},$$

are equivalent if there exists a continuously differentiable bijection
$s \mapsto t(s)$ from $[c, d]$ to $[a, b]$ so that $t'(s) > 0$ and

$$\tilde{z}(s) = z(t(s)).$$

The condition $t'(s) > 0$ says precisely that the orientation is preserved:
as $s$ travels from $c$ to $d$, then $t(s)$ travels from $a$ to $b$. The family of
all parametrizations that are equivalent to $z(t)$ determines a smooth
curve $\gamma \subset \mathbb{C}$, namely the image of $[a, b]$ under $z$ with the orientation
given by $z$ as $t$ travels from $a$ to $b$. We can define a curve $\gamma^-$ obtained
from the curve $\gamma$ by reversing the orientation (so that $\gamma$ and $\gamma^-$ consist
of the same points in the plane). As a particular parametrization for $\gamma^-$
we can take $z^- : [a, b] \to \mathbb{R}^2$ defined by

$$z^-(t) = z(b + a - t).$$

It is also clear how to define a piecewise-smooth curve. The points
$z(a)$ and $z(b)$ are called the end-points of the curve and are independent
of the parametrization. Since $\gamma$ carries an orientation, it is natural to
say that $\gamma$ begins at $z(a)$ and ends at $z(b)$.

A smooth or piecewise-smooth curve is closed if $z(a) = z(b)$ for any
of its parametrizations. Finally, a smooth or piecewise-smooth curve is
simple if it is not self-intersecting, that is, $z(t) \ne z(s)$ unless $s = t$. Of
course, if the curve is closed to begin with, then we say that it is simple
whenever $z(t) \ne z(s)$ unless $s = t$, or $s = a$ and $t = b$.



For brevity, we shall call any piecewise-smooth curve a curve, since 
these will be the objects we shall be primarily concerned with. 

A basic example consists of a circle. Consider the circle $C_r(z_0)$ centered
at $z_0$ and of radius $r$, which by definition is the set

$$C_r(z_0) = \{ z \in \mathbb{C} : |z - z_0| = r \}.$$

The positive orientation (counterclockwise) is the one that is given by
the standard parametrization

$$z(t) = z_0 + re^{it}, \quad \text{where } t \in [0, 2\pi],$$

while the negative orientation (clockwise) is given by

$$z(t) = z_0 + re^{-it}, \quad \text{where } t \in [0, 2\pi].$$

In the following chapters, we shall denote by $C$ a general positively oriented
circle.

An important tool in the study of holomorphic functions is integration
of functions along curves. Loosely speaking, a key theorem in complex
analysis says that if a function is holomorphic in the interior of a closed
curve $\gamma$, then

$$\int_\gamma f(z)\, dz = 0,$$

and we shall turn our attention to a version of this theorem (called
Cauchy's theorem) in the next chapter. Here we content ourselves with
the necessary definitions and properties of the integral.

Given a smooth curve $\gamma$ in $\mathbb{C}$ parametrized by $z : [a, b] \to \mathbb{C}$, and $f$ a
continuous function on $\gamma$, we define the integral of $f$ along $\gamma$ by

$$\int_\gamma f(z)\, dz = \int_a^b f(z(t))\, z'(t)\, dt.$$

In order for this definition to be meaningful, we must show that the
right-hand integral is independent of the parametrization chosen for $\gamma$.
Say that $\tilde{z}$ is an equivalent parametrization as above. Then the change
of variables formula and the chain rule imply that

$$\int_a^b f(z(t))\, z'(t)\, dt = \int_c^d f(z(t(s)))\, z'(t(s))\, t'(s)\, ds = \int_c^d f(\tilde{z}(s))\, \tilde{z}'(s)\, ds.$$

This proves that the integral of $f$ over $\gamma$ is well defined.
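
The independence of parametrization can be observed numerically. The Python sketch below is only an illustration under stated assumptions: the curve is the upper unit semicircle, the integrand is $f(z) = \bar{z}$ (continuous but not holomorphic), the change of variable is $t(s) = \pi(s + s^2)/2$, and the integral is approximated by a midpoint rule. The function name `contour_integral` is hypothetical.

```python
import cmath
import math


def contour_integral(f, z, dz, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f(z(t)) * z'(t) over [a, b]."""
    step = (b - a) / n
    total = 0j
    for k in range(n):
        t = a + (k + 0.5) * step
        total += f(z(t)) * dz(t)
    return total * step


f = lambda w: w.conjugate()  # continuous on the curve, not holomorphic

# Parametrization z(t) = e^{it} on [a, b] = [0, pi].
z = lambda t: cmath.exp(1j * t)
dz = lambda t: 1j * cmath.exp(1j * t)

# Equivalent parametrization via t(s) = pi * (s + s^2) / 2 on [c, d] = [0, 1];
# t'(s) = pi * (1 + 2s) / 2 > 0, so the orientation is preserved.
t_of_s = lambda s: math.pi * (s + s * s) / 2
dt_ds = lambda s: math.pi * (1 + 2 * s) / 2
z_tilde = lambda s: z(t_of_s(s))
dz_tilde = lambda s: dz(t_of_s(s)) * dt_ds(s)  # chain rule

integral_t = contour_integral(f, z, dz, 0.0, math.pi)
integral_s = contour_integral(f, z_tilde, dz_tilde, 0.0, 1.0)
print(integral_t, integral_s)  # both approximate i * pi
```

Here $\bar{z}\, z'(t) = e^{-it} \cdot i e^{it} = i$, so the exact value is $i\pi$ for either parametrization.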

If $\gamma$ is piecewise smooth, then the integral of $f$ over $\gamma$ is simply the
sum of the integrals of $f$ over the smooth parts of $\gamma$, so if $z(t)$ is a
piecewise-smooth parametrization as before, then

$$\int_\gamma f(z)\, dz = \sum_{k=0}^{n-1} \int_{a_k}^{a_{k+1}} f(z(t))\, z'(t)\, dt.$$

By definition, the length of the smooth curve $\gamma$ is

$$\mathrm{length}(\gamma) = \int_a^b |z'(t)|\, dt.$$

Arguing as we just did, it is clear that this definition is also independent
of the parametrization. Also, if $\gamma$ is only piecewise-smooth, then its
length is the sum of the lengths of its smooth parts.

Proposition 3.1 Integration of continuous functions over curves satisfies
the following properties:

(i) It is linear, that is, if $\alpha, \beta \in \mathbb{C}$, then

$$\int_\gamma (\alpha f(z) + \beta g(z))\, dz = \alpha \int_\gamma f(z)\, dz + \beta \int_\gamma g(z)\, dz.$$

(ii) If $\gamma^-$ is $\gamma$ with the reverse orientation, then

$$\int_\gamma f(z)\, dz = - \int_{\gamma^-} f(z)\, dz.$$

(iii) One has the inequality

$$\left| \int_\gamma f(z)\, dz \right| \le \sup_{z \in \gamma} |f(z)| \cdot \mathrm{length}(\gamma).$$

Proof. The first property follows from the definition and the linearity
of the Riemann integral. The second property is left as an exercise. For
the third, note that

$$\left| \int_\gamma f(z)\, dz \right| \le \sup_{t \in [a,b]} |f(z(t))| \int_a^b |z'(t)|\, dt \le \sup_{z \in \gamma} |f(z)| \cdot \mathrm{length}(\gamma)$$

as was to be shown.
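
Properties (ii) and (iii) can be checked numerically. The following Python sketch makes several assumptions that are not in the text: the curve is the quarter circle $z(t) = 2e^{it}$, $0 \le t \le \pi/2$, the integrand is $f(z) = e^z$, integrals are approximated by a midpoint rule, and the supremum is estimated by sampling. The names `contour_integral` and `curve_length` are hypothetical.

```python
import cmath
import math


def contour_integral(f, z, dz, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f(z(t)) * z'(t) over [a, b]."""
    step = (b - a) / n
    total = 0j
    for k in range(n):
        t = a + (k + 0.5) * step
        total += f(z(t)) * dz(t)
    return total * step


def curve_length(dz, a, b, n=2000):
    """Midpoint-rule approximation of the integral of |z'(t)| over [a, b]."""
    step = (b - a) / n
    return sum(abs(dz(a + (k + 0.5) * step)) for k in range(n)) * step


a, b = 0.0, math.pi / 2
z = lambda t: 2 * cmath.exp(1j * t)
dz = lambda t: 2j * cmath.exp(1j * t)

# Reverse orientation: z^-(t) = z(b + a - t), whose derivative is -z'(b + a - t).
z_rev = lambda t: z(b + a - t)
dz_rev = lambda t: -dz(b + a - t)

f = cmath.exp

integral = contour_integral(f, z, dz, a, b)
integral_rev = contour_integral(f, z_rev, dz_rev, a, b)
length = curve_length(dz, a, b)
# Sampled estimate of sup |f| on the curve (endpoints included).
sup_f = max(abs(f(z(a + k * (b - a) / 1000))) for k in range(1001))

print(abs(integral + integral_rev))    # property (ii): close to 0
print(abs(integral), sup_f * length)   # property (iii): left value is smaller
```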


As we have said, Cauchy's theorem states that for appropriate closed
curves $\gamma$ in an open set $\Omega$ on which $f$ is holomorphic, then

$$\int_\gamma f(z)\, dz = 0.$$

The existence of primitives gives a first manifestation of this phenomenon.
Suppose $f$ is a function on the open set $\Omega$. A primitive for $f$ on $\Omega$ is a
function $F$ that is holomorphic on $\Omega$ and such that $F'(z) = f(z)$ for all
$z \in \Omega$.

Theorem 3.2 If a continuous function $f$ has a primitive $F$ in $\Omega$, and
$\gamma$ is a curve in $\Omega$ that begins at $w_1$ and ends at $w_2$, then

$$\int_\gamma f(z)\, dz = F(w_2) - F(w_1).$$

Proof. If $\gamma$ is smooth, the proof is a simple application of the chain
rule and the fundamental theorem of calculus. Indeed, if $z(t) : [a, b] \to \mathbb{C}$
is a parametrization for $\gamma$, then $z(a) = w_1$ and $z(b) = w_2$, and we have

$$\int_\gamma f(z)\, dz = \int_a^b f(z(t))\, z'(t)\, dt = \int_a^b F'(z(t))\, z'(t)\, dt = \int_a^b \frac{d}{dt} F(z(t))\, dt = F(z(b)) - F(z(a)).$$

If $\gamma$ is only piecewise-smooth, then arguing as we just did, we obtain
a telescopic sum, and we have

$$\int_\gamma f(z)\, dz = \sum_{k=0}^{n-1} \left[ F(z(a_{k+1})) - F(z(a_k)) \right] = F(z(a_n)) - F(z(a_0)) = F(z(b)) - F(z(a)).$$
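
A small numerical experiment illustrates that the integral depends only on the end-points when a primitive exists. The sketch below assumes $f(z) = 3z^2$ with primitive $F(z) = z^3$, and compares two polygonal paths from $0$ to $1 + i$; the helpers `segment_integral` and `polygonal_integral` are hypothetical names, and the midpoint rule is just one convenient quadrature.

```python
def segment_integral(f, p, q, n=2000):
    """Midpoint-rule integral of f along the segment z(t) = p + t (q - p), 0 <= t <= 1."""
    step = 1.0 / n
    total = 0j
    for k in range(n):
        total += f(p + (k + 0.5) * step * (q - p))
    return total * step * (q - p)  # z'(t) = q - p is constant on a segment


def polygonal_integral(f, vertices):
    """Sum of the integrals over consecutive segments (a piecewise-smooth curve)."""
    return sum(segment_integral(f, p, q) for p, q in zip(vertices, vertices[1:]))


f = lambda w: 3 * w ** 2
F = lambda w: w ** 3  # a primitive of f on all of C

w1, w2 = 0j, 1 + 1j
along_two_segments = polygonal_integral(f, [w1, 1 + 0j, w2])
along_diagonal = polygonal_integral(f, [w1, w2])
expected = F(w2) - F(w1)  # equals -2 + 2i

print(along_two_segments, along_diagonal, expected)
```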


Corollary 3.3 If $\gamma$ is a closed curve in an open set $\Omega$, and $f$ is continuous
and has a primitive in $\Omega$, then

$$\int_\gamma f(z)\, dz = 0.$$

This is immediate since the end-points of a closed curve coincide.

For example, the function $f(z) = 1/z$ does not have a primitive in the
open set $\mathbb{C} - \{0\}$, since if $C$ is the unit circle parametrized by $z(t) = e^{it}$,
$0 \le t \le 2\pi$, we have

$$\int_C f(z)\, dz = \int_0^{2\pi} \frac{i e^{it}}{e^{it}}\, dt = 2\pi i \ne 0.$$

In subsequent chapters, we shall see that this innocent calculation, which
provides an example of a function $f$ and closed curve $\gamma$ for which $\int_\gamma f(z)\, dz \ne 0$,
lies at the heart of the theory.
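
The calculation above is easy to reproduce numerically. The Python sketch below, which assumes a midpoint-rule quadrature and uses the hypothetical helper `contour_integral`, integrates $1/z$ around the unit circle and, for contrast, $1/z^2$, which does have the primitive $-1/z$ in $\mathbb{C} - \{0\}$.

```python
import cmath
import math


def contour_integral(f, z, dz, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f(z(t)) * z'(t) over [a, b]."""
    step = (b - a) / n
    total = 0j
    for k in range(n):
        t = a + (k + 0.5) * step
        total += f(z(t)) * dz(t)
    return total * step


# Unit circle with the positive orientation: z(t) = e^{it}, 0 <= t <= 2 pi.
z = lambda t: cmath.exp(1j * t)
dz = lambda t: 1j * cmath.exp(1j * t)

no_primitive = contour_integral(lambda w: 1 / w, z, dz, 0.0, 2 * math.pi)
has_primitive = contour_integral(lambda w: 1 / w ** 2, z, dz, 0.0, 2 * math.pi)

print(no_primitive)   # approximately 2 * pi * i
print(has_primitive)  # approximately 0, consistent with Corollary 3.3
```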


Corollary 3.4 If $f$ is holomorphic in a region $\Omega$ and $f' = 0$, then $f$ is
constant.

Proof. Fix a point $w_0 \in \Omega$. It suffices to show that $f(w) = f(w_0)$ for
all $w \in \Omega$.

Since $\Omega$ is connected, for any $w \in \Omega$, there exists a curve $\gamma$ which joins
$w_0$ to $w$. Since $f$ is clearly a primitive for $f'$, we have

$$\int_\gamma f'(z)\, dz = f(w) - f(w_0).$$

By assumption, $f' = 0$ so the integral on the left is 0, and we conclude
that $f(w) = f(w_0)$ as desired.

Remark on notation. When convenient, we follow the practice of using
the notation $f(z) = O(g(z))$ to mean that there is a constant $C > 0$ such
that $|f(z)| \le C|g(z)|$ for $z$ in a neighborhood of the point in question.
In addition, we say $f(z) = o(g(z))$ when $|f(z)/g(z)| \to 0$. We also write
$f(z) \sim g(z)$ to mean that $f(z)/g(z) \to 1$.

4 Exercises 

1. Describe geometrically the sets of points $z$ in the complex plane defined by the
following relations:

(a) $|z - z_1| = |z - z_2|$ where $z_1, z_2 \in \mathbb{C}$.

(b) $1/z = \bar{z}$.

(c) $\mathrm{Re}(z) = 3$.

(d) $\mathrm{Re}(z) > c$ (resp., $\ge c$) where $c \in \mathbb{R}$.

(e) $\mathrm{Re}(az + b) > 0$ where $a, b \in \mathbb{C}$.

(f) $|z| = \mathrm{Re}(z) + 1$.

(g) $\mathrm{Im}(z) = c$ with $c \in \mathbb{R}$.

2. Let $\langle \cdot, \cdot \rangle$ denote the usual inner product in $\mathbb{R}^2$. In other words, if $Z = (x_1, y_1)$
and $W = (x_2, y_2)$, then

$$\langle Z, W \rangle = x_1 x_2 + y_1 y_2.$$

Similarly, we may define a Hermitian inner product $(\cdot, \cdot)$ in $\mathbb{C}$ by

$$(z, w) = z\bar{w}.$$

The term Hermitian is used to describe the fact that $(\cdot, \cdot)$ is not symmetric, but
rather satisfies the relation

$$(z, w) = \overline{(w, z)} \quad \text{for all } z, w \in \mathbb{C}.$$

Show that

$$\langle z, w \rangle = \tfrac{1}{2} [(z, w) + (w, z)] = \mathrm{Re}(z, w),$$

where we use the usual identification $z = x + iy \in \mathbb{C}$ with $(x, y) \in \mathbb{R}^2$.

3. With $\omega = se^{i\varphi}$, where $s \ge 0$ and $\varphi \in \mathbb{R}$, solve the equation $z^n = \omega$ in $\mathbb{C}$ where
$n$ is a natural number. How many solutions are there?

4. Show that it is impossible to define a total ordering on $\mathbb{C}$. In other words, one
cannot find a relation $\succ$ between complex numbers so that:

(i) For any two complex numbers $z, w$, one and only one of the following is true:
$z \succ w$, $w \succ z$ or $z = w$.

(ii) For all $z_1, z_2, z_3 \in \mathbb{C}$ the relation $z_1 \succ z_2$ implies $z_1 + z_3 \succ z_2 + z_3$.

(iii) Moreover, for all $z_1, z_2, z_3 \in \mathbb{C}$ with $z_3 \succ 0$, then $z_1 \succ z_2$ implies $z_1 z_3 \succ z_2 z_3$.

[Hint: First check if $i \succ 0$ is possible.]

5. A set $\Omega$ is said to be pathwise connected if any two points in $\Omega$ can be
joined by a (piecewise-smooth) curve entirely contained in $\Omega$. The purpose of this
exercise is to prove that an open set $\Omega$ is pathwise connected if and only if $\Omega$ is
connected.

(a) Suppose first that $\Omega$ is open and pathwise connected, and that it can be
written as $\Omega = \Omega_1 \cup \Omega_2$ where $\Omega_1$ and $\Omega_2$ are disjoint non-empty open sets.
Choose two points $w_1 \in \Omega_1$ and $w_2 \in \Omega_2$ and let $\gamma$ denote a curve in $\Omega$
joining $w_1$ to $w_2$. Consider a parametrization $z : [0, 1] \to \Omega$ of this curve
with $z(0) = w_1$ and $z(1) = w_2$, and let

$$t^* = \sup_{0 \le t \le 1} \{ t : z(s) \in \Omega_1 \text{ for all } 0 \le s < t \}.$$

Arrive at a contradiction by considering the point $z(t^*)$.

(b) Conversely, suppose that $\Omega$ is open and connected. Fix a point $w \in \Omega$ and
let $\Omega_1 \subset \Omega$ denote the set of all points that can be joined to $w$ by a curve
contained in $\Omega$. Also, let $\Omega_2 \subset \Omega$ denote the set of all points that cannot be
joined to $w$ by a curve in $\Omega$. Prove that both $\Omega_1$ and $\Omega_2$ are open, disjoint
and their union is $\Omega$. Finally, since $\Omega_1$ is non-empty (why?) conclude that
$\Omega = \Omega_1$ as desired.

The proof actually shows that the regularity and type of curves we used to define
pathwise connectedness can be relaxed without changing the equivalence between
the two definitions when $\Omega$ is open. For instance, we may take all curves to be
continuous, or simply polygonal lines.$^2$

6. Let $\Omega$ be an open set in $\mathbb{C}$ and $z \in \Omega$. The connected component (or simply
the component) of $\Omega$ containing $z$ is the set $C_z$ of all points $w$ in $\Omega$ that can be
joined to $z$ by a curve entirely contained in $\Omega$.

(a) Check first that $C_z$ is open and connected. Then, show that $w \in C_z$ defines
an equivalence relation, that is: (i) $z \in C_z$, (ii) $w \in C_z$ implies $z \in C_w$, and
(iii) if $w \in C_z$ and $z \in C_\zeta$, then $w \in C_\zeta$.

Thus $\Omega$ is the union of all its connected components, and two components
are either disjoint or coincide.

(b) Show that $\Omega$ can have only countably many distinct connected components.

(c) Prove that if $\Omega$ is the complement of a compact set, then $\Omega$ has only one
unbounded component.

[Hint: For (b), one would otherwise obtain an uncountable number of disjoint open
balls. Now, each ball contains a point with rational coordinates. For (c), note that
the complement of a large disc containing the compact set is connected.]

7. The family of mappings introduced here plays an important role in complex
analysis. These mappings, sometimes called Blaschke factors, will reappear in
various applications in later chapters.

(a) Let $z, w$ be two complex numbers such that $\bar{z}w \ne 1$. Prove that

$$\left| \frac{w - z}{1 - \bar{w}z} \right| < 1 \quad \text{if } |z| < 1 \text{ and } |w| < 1,$$

and also that

$$\left| \frac{w - z}{1 - \bar{w}z} \right| = 1 \quad \text{if } |z| = 1 \text{ or } |w| = 1.$$

[Hint: Why can one assume that $z$ is real? It then suffices to prove that

$$(r - w)(r - \bar{w}) \le (1 - rw)(1 - r\bar{w})$$

with equality for appropriate $r$ and $|w|$.]

(b) Prove that for a fixed $w$ in the unit disc $\mathbb{D}$, the mapping

$$F : z \mapsto \frac{w - z}{1 - \bar{w}z}$$

satisfies the following conditions:

(i) $F$ maps the unit disc to itself (that is, $F : \mathbb{D} \to \mathbb{D}$), and is holomorphic.

(ii) $F$ interchanges 0 and $w$, namely $F(0) = w$ and $F(w) = 0$.

(iii) $|F(z)| = 1$ if $|z| = 1$.

(iv) $F : \mathbb{D} \to \mathbb{D}$ is bijective. [Hint: Calculate $F \circ F$.]

2 A polygonal line is a piecewise-smooth curve which consists of finitely many straight
line segments.


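8. Suppose $U$ and $V$ are open sets in the complex plane. Prove that if $f : U \to V$
and $g : V \to \mathbb{C}$ are two functions that are differentiable (in the real sense, that is,
as functions of the two real variables $x$ and $y$), and $h = g \circ f$, then

$$\frac{\partial h}{\partial z} = \frac{\partial g}{\partial z} \frac{\partial f}{\partial z} + \frac{\partial g}{\partial \bar{z}} \frac{\partial \bar{f}}{\partial z}$$

and

$$\frac{\partial h}{\partial \bar{z}} = \frac{\partial g}{\partial z} \frac{\partial f}{\partial \bar{z}} + \frac{\partial g}{\partial \bar{z}} \frac{\partial \bar{f}}{\partial \bar{z}}.$$

This is the complex version of the chain rule.

9. Show that in polar coordinates, the Cauchy-Riemann equations take the form

$$\frac{\partial u}{\partial r} = \frac{1}{r} \frac{\partial v}{\partial \theta} \quad \text{and} \quad \frac{1}{r} \frac{\partial u}{\partial \theta} = - \frac{\partial v}{\partial r}.$$

Use these equations to show that the logarithm function defined by

$$\log z = \log r + i\theta \quad \text{where } z = re^{i\theta} \text{ with } -\pi < \theta < \pi$$

is holomorphic in the region $r > 0$ and $-\pi < \theta < \pi$.

10. Show that

$$4 \frac{\partial}{\partial z} \frac{\partial}{\partial \bar{z}} = 4 \frac{\partial}{\partial \bar{z}} \frac{\partial}{\partial z} = \Delta,$$

where $\Delta$ is the Laplacian

$$\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}.$$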


11. Use Exercise 10 to prove that if $f$ is holomorphic in the open set $\Omega$, then the
real and imaginary parts of $f$ are harmonic; that is, their Laplacian is zero.

12. Consider the function defined by

$$f(x + iy) = \sqrt{|x||y|}, \quad \text{whenever } x, y \in \mathbb{R}.$$

Show that $f$ satisfies the Cauchy-Riemann equations at the origin, yet $f$ is not
holomorphic at 0.

13. Suppose that $f$ is holomorphic in an open set $\Omega$. Prove that in any one of the
following cases:

(a) $\mathrm{Re}(f)$ is constant;

(b) $\mathrm{Im}(f)$ is constant;

(c) $|f|$ is constant;

one can conclude that $f$ is constant.


14. Suppose $\{a_n\}_{n=1}^{N}$ and $\{b_n\}_{n=1}^{N}$ are two finite sequences of complex numbers.
Let $B_k = \sum_{n=1}^{k} b_n$ denote the partial sums of the series $\sum b_n$ with the convention
$B_0 = 0$. Prove the summation by parts formula

$$\sum_{n=M}^{N} a_n b_n = a_N B_N - a_M B_{M-1} - \sum_{n=M}^{N-1} (a_{n+1} - a_n) B_n.$$

15. Abel's theorem. Suppose $\sum_{n=1}^{\infty} a_n$ converges. Prove that

$$\lim_{r \to 1,\ r < 1} \sum_{n=1}^{\infty} r^n a_n = \sum_{n=1}^{\infty} a_n.$$

[Hint: Sum by parts.] In other words, if a series converges, then it is Abel summable
with the same limit. For the precise definition of these terms, and more information
on summability methods, we refer the reader to Book I, Chapter 2.

16. Determine the radius of convergence of the series $\sum_{n=1}^{\infty} a_n z^n$ when:

(a) $a_n = (\log n)^2$

(b) $a_n = n!$

(c) $a_n = \frac{n^2}{4^n + 3n}$

(d) $a_n = (n!)^3/(3n)!$ [Hint: Use Stirling's formula, which says that
$n! \sim c\, n^{n + \frac{1}{2}} e^{-n}$ for some $c > 0$.]

(e) Find the radius of convergence of the hypergeometric series

$$F(\alpha, \beta, \gamma; z) = 1 + \sum_{n=1}^{\infty} \frac{\alpha(\alpha+1) \cdots (\alpha+n-1)\, \beta(\beta+1) \cdots (\beta+n-1)}{n!\, \gamma(\gamma+1) \cdots (\gamma+n-1)} z^n.$$

Here $\alpha, \beta \in \mathbb{C}$ and $\gamma \ne 0, -1, -2, \ldots$.

(f) Find the radius of convergence of the Bessel function of order $r$:

$$J_r(z) = \left( \frac{z}{2} \right)^r \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\,(n+r)!} \left( \frac{z}{2} \right)^{2n},$$

where $r$ is a positive integer.

17. Show that if $\{a_n\}_{n=0}^{\infty}$ is a sequence of non-zero complex numbers such that

$$\lim_{n \to \infty} \frac{|a_{n+1}|}{|a_n|} = L,$$

then

$$\lim_{n \to \infty} |a_n|^{1/n} = L.$$

In particular, this exercise shows that when applicable, the ratio test can be used
to calculate the radius of convergence of a power series.

18. Let $f$ be a power series centered at the origin. Prove that $f$ has a power series
expansion around any point in its disc of convergence.

[Hint: Write $z = z_0 + (z - z_0)$ and use the binomial expansion for $z^n$.]

19. Prove the following:

(a) The power series $\sum n z^n$ does not converge on any point of the unit circle.

(b) The power series $\sum z^n/n^2$ converges at every point of the unit circle.

(c) The power series $\sum z^n/n$ converges at every point of the unit circle except
$z = 1$. [Hint: Sum by parts.]

20. Expand $(1 - z)^{-m}$ in powers of $z$. Here $m$ is a fixed positive integer. Also,
show that if

$$(1 - z)^{-m} = \sum_{n=0}^{\infty} a_n z^n,$$

then one obtains the following asymptotic relation for the coefficients:

$$a_n \sim \frac{1}{(m-1)!}\, n^{m-1} \quad \text{as } n \to \infty.$$

21. Show that for $|z| < 1$, one has

$$\frac{z}{1 - z^2} + \frac{z^2}{1 - z^4} + \cdots + \frac{z^{2^n}}{1 - z^{2^{n+1}}} + \cdots = \frac{z}{1 - z},$$

and

$$\frac{z}{1 + z} + \frac{2z^2}{1 + z^2} + \cdots + \frac{2^k z^{2^k}}{1 + z^{2^k}} + \cdots = \frac{z}{1 - z}.$$

Justify any change in the order of summation.

[Hint: Use the dyadic expansion of an integer and the fact that $2^{k+1} - 1 = 1 +
2 + 2^2 + \cdots + 2^k$.]

22. Let $\mathbb{N} = \{1, 2, 3, \ldots\}$ denote the set of positive integers. A subset $S \subset \mathbb{N}$ is
said to be in arithmetic progression if

$$S = \{a, a + d, a + 2d, a + 3d, \ldots\}$$

where $a, d \in \mathbb{N}$. Here $d$ is called the step of $S$.

Show that $\mathbb{N}$ cannot be partitioned into a finite number of subsets that are in
arithmetic progression with distinct steps (except for the trivial case $a = d = 1$).
[Hint: Write $\sum_{n \in \mathbb{N}} z^n$ as a sum of terms of the type $\frac{z^a}{1 - z^d}$.]

23. Consider the function $f$ defined on $\mathbb{R}$ by

$$f(x) = \begin{cases} 0 & \text{if } x \le 0, \\ e^{-1/x^2} & \text{if } x > 0. \end{cases}$$

Prove that $f$ is indefinitely differentiable on $\mathbb{R}$, and that $f^{(n)}(0) = 0$ for all $n \ge 1$.
Conclude that $f$ does not have a converging power series expansion $\sum_{n=0}^{\infty} a_n x^n$
for $x$ near the origin.


24. Let $\gamma$ be a smooth curve in $\mathbb{C}$ parametrized by $z(t) : [a, b] \to \mathbb{C}$. Let $\gamma^-$ denote
the curve with the same image as $\gamma$ but with the reverse orientation. Prove that
for any continuous function $f$ on $\gamma$

$$\int_\gamma f(z)\, dz = - \int_{\gamma^-} f(z)\, dz.$$

25. The next three calculations provide some insight into Cauchy's theorem, which
we treat in the next chapter.

(a) Evaluate the integrals

$$\int_\gamma z^n\, dz$$

for all integers $n$. Here $\gamma$ is any circle centered at the origin with the positive
(counterclockwise) orientation.

(b) Same question as before, but with $\gamma$ any circle not containing the origin.

(c) Show that if $|a| < r < |b|$, then

$$\int_\gamma \frac{1}{(z - a)(z - b)}\, dz = \frac{2\pi i}{a - b},$$

where $\gamma$ denotes the circle centered at the origin, of radius $r$, with the
positive orientation.

26. Suppose $f$ is continuous in a region $\Omega$. Prove that any two primitives of $f$ (if
they exist) differ by a constant.



Cauchy’s Theorem and Its 
Applications 


The solution of a large number of problems can be
reduced, in the last analysis, to the evaluation of definite
integrals; thus mathematicians have been much
occupied with this task... However, among many results
obtained, a number were initially discovered by
the aid of a type of induction based on the passage 
from real to imaginary. Often passage of this kind 
led directly to remarkable results. Nevertheless this 
part of the theory, as has been observed by Laplace, 
is subject to various difficulties... 

After having reflected on this subject and brought 
together various results mentioned above, I hope to 
establish the passage from the real to the imaginary 
based on a direct and rigorous analysis; my researches 
have thus led me to the method which is the object of 
this memoir... 

A. L. Cauchy, 1827 


In the previous chapter, we discussed several preliminary ideas in complex
analysis: open sets in $\mathbb{C}$, holomorphic functions, and integration
along curves. The first remarkable result of the theory exhibits a deep
connection between these notions. Loosely stated, Cauchy's theorem
says that if $f$ is holomorphic in an open set $\Omega$ and $\gamma \subset \Omega$ is a closed
curve whose interior is also contained in $\Omega$ then

$$(1) \qquad \int_\gamma f(z)\, dz = 0.$$

Many results that follow, and in particular the calculus of residues, are
related in one way or another to this fact.

A precise and general formulation of Cauchy's theorem requires defining
unambiguously the "interior" of a curve, and this is not always an
easy task. At this early stage of our study, we shall make use of the
device of limiting ourselves to regions whose boundaries are curves that
are "toy contours." As the name suggests, these are closed curves whose
visualization is so simple that the notion of their interior will be unambiguous,
and the proof of Cauchy's theorem in this setting will be quite
direct. For many applications, it will suffice to restrict ourselves to these
types of curves. At a later stage, we take up the questions related to
more general curves, their interiors, and the extended form of Cauchy's
theorem.

Our initial version of Cauchy's theorem begins with the observation
that it suffices that $f$ have a primitive in $\Omega$, by Corollary 3.3 in Chapter 1.
The existence of such a primitive for toy contours will follow from a
theorem of Goursat (which is itself a simple special case)$^1$ that asserts
that if $f$ is holomorphic in an open set that contains a triangle $T$ and its
interior, then

$$\int_T f(z)\, dz = 0.$$

It is noteworthy that this simple case of Cauchy's theorem suffices to
prove some of its more complicated versions. From there, we can prove
the existence of primitives in the interior of some simple regions, and
therefore prove Cauchy's theorem in that setting. As a first application
of this viewpoint, we evaluate several real integrals by using appropriate
toy contours.

The above ideas also lead us to a central result of this chapter, the
Cauchy integral formula; this states that if $f$ is holomorphic in an open
set containing a circle $C$ and its interior, then for all $z$ inside $C$,

$$f(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\, d\zeta.$$

Differentiation of this identity yields other integral formulas, and in
particular we obtain the regularity of holomorphic functions. This is
remarkable, since holomorphicity assumed only the existence of the first
derivative, and yet we obtain as a consequence the existence of derivatives
of all orders. (An analogous statement is decisively false in the case of
real variables!)
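
Although the Cauchy integral formula, $f(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\, d\zeta$, is only proved later in the chapter, it can already be tested numerically. The Python sketch below is an illustration under assumptions of our own choosing: $f = \exp$, $C$ the positively oriented unit circle, the interior point $z = 0.3 + 0.4i$, and a midpoint-rule quadrature in the hypothetical helper `contour_integral`.

```python
import cmath
import math


def contour_integral(f, z, dz, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f(z(t)) * z'(t) over [a, b]."""
    step = (b - a) / n
    total = 0j
    for k in range(n):
        t = a + (k + 0.5) * step
        total += f(z(t)) * dz(t)
    return total * step


f = cmath.exp
z0 = 0.3 + 0.4j  # a point inside the unit circle

# Positively oriented unit circle: zeta(t) = e^{it}, 0 <= t <= 2 pi.
zeta = lambda t: cmath.exp(1j * t)
dzeta = lambda t: 1j * cmath.exp(1j * t)

cauchy_value = contour_integral(
    lambda w: f(w) / (w - z0), zeta, dzeta, 0.0, 2 * math.pi
) / (2j * math.pi)

print(cauchy_value, f(z0))  # the two values agree closely
```

The values of $f$ on the circle alone reproduce $f$ at an interior point, which is the content of the formula.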

The theory developed up to that point already has a number of noteworthy
consequences:

• The property at the base of "analytic continuation," namely that a
holomorphic function is determined by its restriction to any open
subset of its domain of definition. This is a consequence of the fact
that holomorphic functions have power series expansions.

• Liouville's theorem, which yields a quick proof of the fundamental
theorem of algebra.

• Morera's theorem, which gives a simple integral characterization
of holomorphic functions, and shows that these functions are preserved
under uniform limits.

1 Goursat's result came after Cauchy's theorem, and its interest is the technical fact
that its proof requires only the existence of the complex derivative at each point, and not
its continuity. For the earlier proof, see Exercise 5.


1 Goursat's theorem

Corollary 3.3 in the previous chapter says that if $f$ has a primitive in an
open set $\Omega$, then

$$\int_\gamma f(z)\, dz = 0$$

for any closed curve $\gamma$ in $\Omega$. Conversely, if we can show that the above
relation holds for some types of curves $\gamma$, then a primitive will exist. Our
starting point is Goursat's theorem, from which in effect we shall deduce
most of the other results in this chapter.

Theorem 1.1 If $\Omega$ is an open set in $\mathbb{C}$, and $T \subset \Omega$ a triangle whose
interior is also contained in $\Omega$, then

$$\int_T f(z)\, dz = 0$$

whenever $f$ is holomorphic in $\Omega$.

Proof. We call $T^{(0)}$ our original triangle (with a fixed orientation
which we choose to be positive), and let $d^{(0)}$ and $p^{(0)}$ denote the diameter
and perimeter of $T^{(0)}$, respectively. The first step in our construction
consists of bisecting each side of the triangle and connecting the midpoints.
This creates four new smaller triangles, denoted $T_1^{(1)}$, $T_2^{(1)}$, $T_3^{(1)}$,
and $T_4^{(1)}$ that are similar to the original triangle. The construction and
orientation of each triangle are illustrated in Figure 1. The orientation
is chosen to be consistent with that of the original triangle, and so after
cancellations arising from integrating over the same side in two opposite
directions, we have

$$(2) \qquad \int_{T^{(0)}} f(z)\, dz = \int_{T_1^{(1)}} f(z)\, dz + \int_{T_2^{(1)}} f(z)\, dz + \int_{T_3^{(1)}} f(z)\, dz + \int_{T_4^{(1)}} f(z)\, dz.$$

For some $j$ we must have

$$\left| \int_{T^{(0)}} f(z)\, dz \right| \le 4 \left| \int_{T_j^{(1)}} f(z)\, dz \right|,$$

for otherwise (2) would be contradicted. We choose a triangle that
satisfies this inequality, and rename it $T^{(1)}$. Observe that if $d^{(1)}$ and
$p^{(1)}$ denote the diameter and perimeter of $T^{(1)}$, respectively, then $d^{(1)} =
(1/2)d^{(0)}$ and $p^{(1)} = (1/2)p^{(0)}$. We now repeat this process for the triangle
$T^{(1)}$, bisecting it into four smaller triangles. Continuing this process,
we obtain a sequence of triangles

$$T^{(0)}, T^{(1)}, \ldots, T^{(n)}, \ldots$$

with the properties that

$$\left| \int_{T^{(0)}} f(z)\, dz \right| \le 4^n \left| \int_{T^{(n)}} f(z)\, dz \right|$$

and

$$d^{(n)} = 2^{-n} d^{(0)}, \qquad p^{(n)} = 2^{-n} p^{(0)},$$

where $d^{(n)}$ and $p^{(n)}$ denote the diameter and perimeter of $T^{(n)}$, respectively.
We also denote by $\mathcal{T}^{(n)}$ the solid closed triangle with boundary
$T^{(n)}$, and observe that our construction yields a sequence of nested compact
sets

$$\mathcal{T}^{(0)} \supset \mathcal{T}^{(1)} \supset \cdots \supset \mathcal{T}^{(n)} \supset \cdots$$

whose diameter goes to 0. By Proposition 1.4 in Chapter 1, there exists
a unique point $z_0$ that belongs to all the solid triangles $\mathcal{T}^{(n)}$. Since $f$ is
holomorphic at $z_0$ we can write

$$f(z) = f(z_0) + f'(z_0)(z - z_0) + \psi(z)(z - z_0),$$

where $\psi(z) \to 0$ as $z \to z_0$. Since the constant $f(z_0)$ and the linear function
$f'(z_0)(z - z_0)$ have primitives, we can integrate the above equality
using Corollary 3.3 in the previous chapter, and obtain

$$(3) \qquad \int_{T^{(n)}} f(z)\, dz = \int_{T^{(n)}} \psi(z)(z - z_0)\, dz.$$

Now $z_0$ belongs to the closure of the solid triangle $\mathcal{T}^{(n)}$ and $z$ to its
boundary, so we must have $|z - z_0| \le d^{(n)}$, and using (3) we get, by (iii)
in Proposition 3.1 of the previous chapter, the estimate

$$\left| \int_{T^{(n)}} f(z)\, dz \right| \le \epsilon_n d^{(n)} p^{(n)},$$

where $\epsilon_n = \sup_{z \in T^{(n)}} |\psi(z)| \to 0$ as $n \to \infty$. Therefore

$$\left| \int_{T^{(n)}} f(z)\, dz \right| \le \epsilon_n 4^{-n} d^{(0)} p^{(0)},$$

which yields our final estimate

$$\left| \int_{T^{(0)}} f(z)\, dz \right| \le 4^n \left| \int_{T^{(n)}} f(z)\, dz \right| \le \epsilon_n d^{(0)} p^{(0)}.$$

Letting $n \to \infty$ concludes the proof since $\epsilon_n \to 0$.
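
The bisection identity (2) and the conclusion of the theorem can both be observed numerically. The Python sketch below rests on assumptions that are not in the text: a particular triangle with vertices $0$, $2$, $1 + 1.5i$, the test functions $e^z$ (holomorphic) and $\bar{z}$ (not holomorphic), and a midpoint-rule quadrature. The helper names are hypothetical.

```python
import cmath


def segment_integral(f, p, q, n=2000):
    """Midpoint-rule integral of f along the segment z(t) = p + t (q - p), 0 <= t <= 1."""
    step = 1.0 / n
    total = 0j
    for k in range(n):
        total += f(p + (k + 0.5) * step * (q - p))
    return total * step * (q - p)


def triangle_integral(f, a, b, c):
    """Integral of f over the triangle boundary, traversed a -> b -> c -> a."""
    return segment_integral(f, a, b) + segment_integral(f, b, c) + segment_integral(f, c, a)


def bisect(a, b, c):
    """Four sub-triangles from the side midpoints, oriented consistently with (a, b, c)."""
    ab, bc, ca = (a + b) / 2, (b + c) / 2, (c + a) / 2
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]


# Assumed triangle, listed counterclockwise; its area is 1.5.
A, B, C = 0j, 2 + 0j, 1 + 1.5j
conj = lambda w: w.conjugate()

holomorphic = triangle_integral(cmath.exp, A, B, C)
non_holomorphic = triangle_integral(conj, A, B, C)
sum_over_pieces = sum(triangle_integral(conj, *t) for t in bisect(A, B, C))

print(abs(holomorphic))                  # close to 0, as Goursat's theorem predicts
print(non_holomorphic, sum_over_pieces)  # both close to 3i: identity (2) needs no holomorphy
```

Identity (2) is pure cancellation along the interior sides and holds for any continuous function; holomorphy enters only in the estimate near $z_0$, which is why the integral of $\bar{z}$ (equal to $2i$ times the area) does not vanish.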


Corollary 1.2 If $f$ is holomorphic in an open set $\Omega$ that contains a
rectangle $R$ and its interior, then

$$\int_R f(z)\, dz = 0.$$

This is immediate since we first choose an orientation as in Figure 2
and note that

$$\int_R f(z)\, dz = \int_{T_1} f(z)\, dz + \int_{T_2} f(z)\, dz.$$

Figure 2. A rectangle as the union of two triangles

2 Local existence of primitives and Cauchy's theorem in
a disc

We first prove the existence of primitives in a disc as a consequence of
Goursat's theorem.

Theorem 2.1 A holomorphic function in an open disc has a primitive 
in that disc. 

Proof. After a translation, we may assume without loss of generality
that the disc, say $D$, is centered at the origin. Given a point $z \in D$,
consider the piecewise-smooth curve that joins 0 to $z$ first by moving in
the horizontal direction from 0 to $\tilde{z}$ where $\tilde{z} = \mathrm{Re}(z)$, and then in the
vertical direction from $\tilde{z}$ to $z$. We choose the orientation from 0 to $z$,
and denote this polygonal line (which consists of at most two segments)
by $\gamma_z$, as shown on Figure 3.

Figure 3. The polygonal line $\gamma_z$

Define

$$F(z) = \int_{\gamma_z} f(w)\, dw.$$

The choice of $\gamma_z$ gives an unambiguous definition of the function $F(z)$.
We contend that $F$ is holomorphic in $D$ and $F'(z) = f(z)$. To prove this,
fix $z \in D$ and let $h \in \mathbb{C}$ be so small that $z + h$ also belongs to the disc.
Now consider the difference

$$F(z + h) - F(z) = \int_{\gamma_{z+h}} f(w)\, dw - \int_{\gamma_z} f(w)\, dw.$$

The function $f$ is first integrated along $\gamma_{z+h}$ with the original orientation,
and then along $\gamma_z$ with the reverse orientation (because of the minus
sign in front of the second integral). This corresponds to (a) in Figure 4.
Since we integrate $f$ over the line segment starting at the origin in two
opposite directions, it cancels, leaving us with the contour in (b). Then,
we complete the square and triangle as shown in (c), so that after an
application of Goursat's theorem for triangles and rectangles we are left
with the line segment from $z$ to $z + h$ as given in (d).

Figure 4. Relation between the polygonal lines $\gamma_z$ and $\gamma_{z+h}$

Hence the above cancellations yield

$$F(z + h) - F(z) = \int_\eta f(w)\, dw,$$

where $\eta$ is the straight line segment from $z$ to $z + h$. Since $f$ is continuous
at $z$ we can write

$$f(w) = f(z) + \psi(w),$$

where $\psi(w) \to 0$ as $w \to z$. Therefore

$$(4) \qquad F(z + h) - F(z) = \int_\eta f(z)\, dw + \int_\eta \psi(w)\, dw = f(z) \int_\eta dw + \int_\eta \psi(w)\, dw.$$

On the one hand, the constant 1 has $w$ as a primitive, so the first integral
is simply $h$ by an application of Theorem 3.2 in Chapter 1. On the other
hand, we have the following estimate:

$$\left| \int_\eta \psi(w)\, dw \right| \le \sup_{w \in \eta} |\psi(w)|\, |h|.$$

Since the supremum above goes to 0 as $h$ tends to 0, we conclude from
equation (4) that

$$\lim_{h \to 0} \frac{F(z + h) - F(z)}{h} = f(z),$$

thereby proving that $F$ is a primitive for $f$ on the disc.
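
The construction in the proof is explicit enough to be carried out numerically. The following Python sketch assumes $f = \cos$ on the unit disc, for which the primitive vanishing at the origin is $\sin$, and approximates the integral along the horizontal-then-vertical path by a midpoint rule; `segment_integral` and `primitive` are hypothetical helper names.

```python
import cmath


def segment_integral(f, p, q, n=2000):
    """Midpoint-rule integral of f along the segment z(t) = p + t (q - p), 0 <= t <= 1."""
    step = 1.0 / n
    total = 0j
    for k in range(n):
        total += f(p + (k + 0.5) * step * (q - p))
    return total * step * (q - p)


def primitive(f, z):
    """F(z): integrate f from 0 to Re(z) horizontally, then vertically up to z."""
    corner = complex(z.real, 0.0)
    return segment_integral(f, 0j, corner) + segment_integral(f, corner, z)


f = cmath.cos
z = 0.4 + 0.5j
h = 1e-3 * (1 + 1j)  # small complex increment, z + h stays in the unit disc

F_z = primitive(f, z)
difference_quotient = (primitive(f, z + h) - F_z) / h

print(abs(F_z - cmath.sin(z)))          # F agrees with sin, the primitive with F(0) = 0
print(abs(difference_quotient - f(z)))  # small, of the order of |h|
```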


This theorem says that locally, every holomorphic function has a primitive.
It is crucial to realize, however, that the theorem is true not only
for arbitrary discs, but also for other sets as well. We shall return to this
point shortly in our discussion of "toy contours."

Theorem 2.2 (Cauchy's theorem for a disc) If $f$ is holomorphic in
a disc, then

$$\int_\gamma f(z)\, dz = 0$$

for any closed curve $\gamma$ in that disc.

Proof. Since $f$ has a primitive, we can apply Corollary 3.3 of Chapter 1.

Corollary 2.3 Suppose $f$ is holomorphic in an open set containing the
circle $C$ and its interior. Then

$$\int_C f(z)\, dz = 0.$$

Proof. Let $D$ be the disc with boundary circle $C$. Then there exists
a slightly larger disc $D'$ which contains $D$ and so that $f$ is holomorphic
on $D'$. We may now apply Cauchy's theorem in $D'$ to conclude that
$\int_C f(z)\, dz = 0$.

In fact, the proofs of the theorem and its corollary apply whenever we 
can define without ambiguity the “interior” of a contour, and construct 
appropriate polygonal paths in an open neighborhood of that contour 
and its interior. In the case of the circle, whose interior is the disc, there 
was no problem since the geometry of the disc made it simple to travel 
horizontally and vertically inside it. 
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The following definition is loosely stated, although its applications 
will be clear and unambiguous. We call a toy contour any closed curve 
where the notion of interior is obvious, and a construction similar to 
that in Theorem 2.1 is possible in a neighborhood of the curve and its 
interior. Its positive orientation is that for which the interior is to the left 
as we travel along the toy contour. This is consistent with the definition 
of the positive orientation of a circle. For example, circles, triangles, 
and rectangles are toy contours, since in each case we can modify (and 
actually copy) the argument given previously. 

Another important example of a toy contour is the “keyhole” Γ (illustrated
in Figure 5), which we shall put to use in the proof of the Cauchy
integral formula. It consists of two almost complete circles, one large
and one small, connected by a narrow corridor.

Figure 5. The keyhole contour

The interior of Γ, which
we denote by Γ_int, is clearly that region enclosed by the curve, and can
be given precise meaning with enough work. We fix a point z_0 in that
interior. If f is holomorphic in a neighborhood of Γ and its interior,
then it is holomorphic in the inside of a slightly larger keyhole, say Λ,
whose interior Λ_int contains Γ ∪ Γ_int. If z ∈ Λ_int, let γ_z denote any curve
contained inside Λ_int connecting z_0 to z, and which consists of finitely
many horizontal or vertical segments (as in Figure 6). If η_z is any other
such curve, the rectangle version of Goursat’s theorem (Corollary 1.2)
implies that

$$\int_{\gamma_z} f(w)\,dw = \int_{\eta_z} f(w)\,dw,$$

and we may therefore define F unambiguously in Λ_int.

Figure 6. A curve γ_z

Arguing as above allows us to show that F is a primitive of f in Λ_int
and therefore $\int_\Gamma f(z)\,dz = 0$.

The important point is that for a toy contour γ we easily have that

$$\int_\gamma f(z)\,dz = 0,$$

whenever f is holomorphic in an open set that contains the contour γ
and its interior.

Other examples of toy contours which we shall encounter in applications
and for which Cauchy’s theorem and its corollary also hold are
given in Figure 7.

While Cauchy’s theorem for toy contours is sufficient for most applica¬ 
tions we deal with, the question still remains as to what happens for more 
general curves. We take up this matter in Appendix B, where we prove 
Jordan’s theorem for piecewise-smooth curves. This theorem states that 
a simple closed piecewise-smooth curve has a well defined interior that 
is “simply connected.” As a consequence, we find that even in this more 
general situation, Cauchy’s theorem holds. 

3 Evaluation of some integrals 

Here we take up the idea that originally motivated Cauchy. We shall 
show by several examples how some integrals may be evaluated by the 
use of his theorem. A more systematic approach, in terms of the calculus 
of residues, may be found in the next chapter.

Figure 7. Examples of toy contours, including a sector and a parallelogram


Example 1. We show that if ξ ∈ ℝ, then

$$e^{-\pi \xi^2} = \int_{-\infty}^{\infty} e^{-\pi x^2} e^{-2\pi i x \xi}\,dx. \tag{5}$$

This gives a new proof of the fact that $e^{-\pi x^2}$ is its own Fourier transform,
a fact we proved in Theorem 1.4 of Chapter 5 in Book I.

If ξ = 0, the formula is precisely the known integral²

$$1 = \int_{-\infty}^{\infty} e^{-\pi x^2}\,dx.$$

Now suppose that ξ > 0, and consider the function $f(z) = e^{-\pi z^2}$, which
is entire, and in particular holomorphic in the interior of the toy contour
γ_R depicted in Figure 8.


² An alternate derivation follows from the fact that $\Gamma(1/2) = \sqrt{\pi}$, where Γ is the gamma
function in Chapter 6.

Figure 8. The contour γ_R in Example 1 (a rectangle with vertices −R, R, R + iξ, −R + iξ)


The contour γ_R consists of a rectangle with vertices R, R + iξ, −R + iξ,
−R and the positive counterclockwise orientation. By Cauchy’s theorem,

$$\int_{\gamma_R} f(z)\,dz = 0. \tag{6}$$

The integral over the real segment is simply

$$\int_{-R}^{R} e^{-\pi x^2}\,dx,$$

which converges to 1 as R → ∞. The integral on the vertical side on the
right is

$$I(R) = \int_0^{\xi} f(R + iy)\,i\,dy = \int_0^{\xi} e^{-\pi(R^2 + 2iRy - y^2)}\,i\,dy.$$

This integral goes to 0 as R → ∞ since ξ is fixed and we may estimate
it by

$$|I(R)| \le C e^{-\pi R^2}.$$

Similarly, the integral over the vertical segment on the left also goes to 0
as R → ∞ for the same reasons. Finally, the integral over the horizontal
segment on top is

$$\int_{R}^{-R} e^{-\pi (x + i\xi)^2}\,dx = -e^{\pi \xi^2} \int_{-R}^{R} e^{-\pi x^2} e^{-2\pi i x \xi}\,dx.$$

Therefore, we find in the limit as R → ∞ that (6) gives

$$0 = 1 - e^{\pi \xi^2} \int_{-\infty}^{\infty} e^{-\pi x^2} e^{-2\pi i x \xi}\,dx,$$









and our desired formula is established. In the case ξ < 0, we then consider
the symmetric rectangle, in the lower half-plane.

The technique of shifting the contour of integration, which was used
in the previous example, has many other applications. Note that the
original integral (5) is taken over the real line, which by an application
of Cauchy’s theorem is then shifted upwards or downwards (depending
on the sign of ξ) in the complex plane.
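As a numerical sanity check of formula (5), the following sketch approximates the integral by the trapezoid rule on a truncated interval. It is an illustration only, not part of the argument: the function name `gaussian_fourier`, the truncation half-width, and the step count are all assumptions chosen for the example.

```python
import cmath
import math

def gaussian_fourier(xi, half_width=8.0, steps=4000):
    """Trapezoid approximation of the integral of
    e^{-pi x^2} e^{-2 pi i x xi} over [-half_width, half_width].
    The truncation is an assumption: the integrand is negligible beyond it."""
    h = 2 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * cmath.exp(-math.pi * x * x - 2j * math.pi * x * xi)
    return total * h

if __name__ == "__main__":
    for xi in (0.0, 0.5, -1.25):
        # The two printed values should agree closely; the imaginary part should be tiny.
        print(xi, gaussian_fourier(xi), math.exp(-math.pi * xi * xi))
```

Negative values of ξ are handled by the same function, which mirrors the remark that the case ξ < 0 uses the symmetric rectangle.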

Example 2. Another classical example is

$$\int_0^{\infty} \frac{1 - \cos x}{x^2}\,dx = \frac{\pi}{2}.$$

Here we consider the function $f(z) = (1 - e^{iz})/z^2$, and we integrate over
the indented semicircle in the upper half-plane positioned on the x-axis,
as shown in Figure 9.

Figure 9. The indented semicircle of Example 2


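If we denote by $\gamma_\epsilon^+$ and $\gamma_R^+$ the semicircles of radii ε and R with negative
and positive orientations respectively, Cauchy’s theorem gives

$$\int_{-R}^{-\epsilon} \frac{1 - e^{ix}}{x^2}\,dx + \int_{\gamma_\epsilon^+} \frac{1 - e^{iz}}{z^2}\,dz + \int_{\epsilon}^{R} \frac{1 - e^{ix}}{x^2}\,dx + \int_{\gamma_R^+} \frac{1 - e^{iz}}{z^2}\,dz = 0.$$

First we let R → ∞ and observe that

$$\left| \frac{1 - e^{iz}}{z^2} \right| \le \frac{2}{|z|^2},$$

so the integral over $\gamma_R^+$ goes to zero. Therefore

$$\int_{|x| \ge \epsilon} \frac{1 - e^{ix}}{x^2}\,dx = -\int_{\gamma_\epsilon^+} \frac{1 - e^{iz}}{z^2}\,dz.$$

Next, note that

$$f(z) = \frac{-iz}{z^2} + E(z),$$

where E(z) is bounded as z → 0, while on $\gamma_\epsilon^+$ we have $z = \epsilon e^{i\theta}$ and
$dz = i\epsilon e^{i\theta}\,d\theta$. Thus

$$\int_{\gamma_\epsilon^+} \frac{1 - e^{iz}}{z^2}\,dz \to \int_{\pi}^{0} (-ii)\,d\theta = -\pi \quad \text{as } \epsilon \to 0.$$

Taking real parts then yields

$$\int_{-\infty}^{\infty} \frac{1 - \cos x}{x^2}\,dx = \pi.$$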


Since the integrand is even, the desired formula is proved. 
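The key limit in Example 2 is the behaviour of the integral over the small semicircle, which the argument shows tends to −π. The sketch below checks this numerically with a midpoint rule; the helper name `small_semicircle_integral` and the step count are illustrative assumptions.

```python
import cmath
import math

def f(z):
    # The function integrated in Example 2.
    return (1 - cmath.exp(1j * z)) / (z * z)

def small_semicircle_integral(eps, steps=2000):
    """Midpoint-rule approximation of the integral of f over the semicircle
    z = eps * e^{i theta}, with theta running from pi down to 0
    (the negative, clockwise orientation used in the text)."""
    dtheta = -math.pi / steps              # negative step: theta decreases
    total = 0j
    for k in range(steps):
        theta = math.pi + (k + 0.5) * dtheta
        z = eps * cmath.exp(1j * theta)
        dz = 1j * z * dtheta               # dz = i eps e^{i theta} d theta
        total += f(z) * dz
    return total

if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        # The values should approach -pi as eps shrinks.
        print(eps, small_semicircle_integral(eps))
```

The discrepancy from −π shrinks roughly in proportion to ε, consistent with the bounded remainder E(z) contributing at most a constant times the length of the arc.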


4 Cauchy’s integral formulas 

Representation formulas, and in particular integral representation formulas,
play an important role in mathematics, since they allow us to recover
a function on a large set from its behavior on a smaller set. For example,
we saw in Book I that a solution of the steady-state heat equation in the
disc was completely determined by its boundary values on the circle via
a convolution with the Poisson kernel

$$u(r, \theta) = \frac{1}{2\pi} \int_0^{2\pi} P_r(\theta - \varphi)\,u(1, \varphi)\,d\varphi. \tag{7}$$

In the case of holomorphic functions, the situation is analogous, which
is not surprising since the real and imaginary parts of a holomorphic
function are harmonic.³ Here, we will prove an integral representation
formula in a manner that is independent of the theory of harmonic functions.
In fact, it is also possible to recover the Poisson integral formula (7)
as a consequence of the next theorem (see Exercises 11 and 12).

Theorem 4.1 Suppose f is holomorphic in an open set that contains
the closure of a disc D. If C denotes the boundary circle of this disc with
the positive orientation, then

$$f(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta \quad \text{for any point } z \in D.$$


³ This fact is an immediate consequence of the Cauchy-Riemann equations. We refer
the reader to Exercise 11 in Chapter 1.

Proof. Fix z ∈ D and consider the “keyhole” $\Gamma_{\delta,\epsilon}$ which omits the
point z as shown in Figure 10.

Here δ is the width of the corridor, and ε the radius of the small circle
centered at z. Since the function F(ζ) = f(ζ)/(ζ − z) is holomorphic
away from the point ζ = z, we have

$$\int_{\Gamma_{\delta,\epsilon}} F(\zeta)\,d\zeta = 0$$

by Cauchy’s theorem for the chosen toy contour. Now we make the
corridor narrower by letting δ tend to 0, and use the continuity of F to
see that in the limit, the integrals over the two sides of the corridor cancel
out. The remaining part consists of two curves, the large boundary circle
C with the positive orientation, and a small circle C_ε centered at z of
radius ε and oriented negatively, that is, clockwise. To see what happens
to the integral over the small circle we write

$$F(\zeta) = \frac{f(\zeta) - f(z)}{\zeta - z} + \frac{f(z)}{\zeta - z} \tag{8}$$

and note that since f is holomorphic the first term on the right-hand
side of (8) is bounded so that its integral over C_ε goes to 0 as ε → 0. To
conclude the proof, it suffices to observe that

$$\int_{C_\epsilon} \frac{f(z)}{\zeta - z}\,d\zeta = f(z) \int_{C_\epsilon} \frac{d\zeta}{\zeta - z} = -f(z) \int_0^{2\pi} \frac{\epsilon i e^{-it}}{\epsilon e^{-it}}\,dt = -f(z)\,2\pi i,$$

so that in the limit we find

$$0 = \int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta - 2\pi i f(z),$$


as was to be shown. 

Remarks. Our earlier discussion of toy contours provides simple extensions
of the Cauchy integral formula; for instance, if f is holomorphic
in an open set that contains a (positively oriented) rectangle R and its
interior, then

$$f(z) = \frac{1}{2\pi i} \int_R \frac{f(\zeta)}{\zeta - z}\,d\zeta,$$

whenever z belongs to the interior of R. To establish this result, it suffices
to repeat the proof of Theorem 4.1 replacing the “circular” keyhole by a
“rectangular” keyhole.

It should also be noted that the above integral vanishes when z is
outside R, since in this case F(ζ) = f(ζ)/(ζ − z) is holomorphic inside
R. Of course, a similar result also holds for the circle or any other toy
contour.
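The Cauchy integral formula, and the remark that the integral vanishes for points outside the contour, can both be checked numerically on a circle. The following is a minimal sketch under stated assumptions: the helper name `cauchy_integral`, the choice of the unit circle, and the test function e^z are illustrative and not taken from the text.

```python
import cmath
import math

def cauchy_integral(f, z, center=0j, radius=1.0, steps=400):
    """Approximate (1 / (2 pi i)) * integral over the positively oriented
    circle |zeta - center| = radius of f(zeta) / (zeta - z) d zeta,
    using equally spaced points on the circle."""
    dtheta = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        w = radius * cmath.exp(1j * k * dtheta)   # zeta - center
        zeta = center + w
        dzeta = 1j * w * dtheta                   # d zeta = i w d theta
        total += f(zeta) / (zeta - z) * dzeta
    return total / (2j * math.pi)

if __name__ == "__main__":
    inside = 0.3 + 0.2j
    outside = 2.0 + 0j
    print(cauchy_integral(cmath.exp, inside), cmath.exp(inside))  # should agree
    print(cauchy_integral(cmath.exp, outside))                    # should be near 0
```

The point z must stay off the circle itself, since the integrand is singular there.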

As a corollary to the Cauchy integral formula, we arrive at a second
remarkable fact about holomorphic functions, namely their regularity.
We also obtain further integral formulas expressing the derivatives of f
inside the disc in terms of the values of f on the boundary.


Corollary 4.2 If f is holomorphic in an open set Ω, then f has infinitely
many complex derivatives in Ω. Moreover, if C ⊂ Ω is a circle whose
interior is also contained in Ω, then

$$f^{(n)}(z) = \frac{n!}{2\pi i} \int_C \frac{f(\zeta)}{(\zeta - z)^{n+1}}\,d\zeta$$

for all z in the interior of C.

We recall that, as in the above theorem, we take the circle C to have
positive orientation.

Proof. The proof is by induction on n, the case n = 0 being simply
the Cauchy integral formula. Suppose that f has up to n − 1 complex
derivatives and that

$$f^{(n-1)}(z) = \frac{(n-1)!}{2\pi i} \int_C \frac{f(\zeta)}{(\zeta - z)^{n}}\,d\zeta.$$

Now for h small, the difference quotient for $f^{(n-1)}$ takes the form

$$\frac{f^{(n-1)}(z+h) - f^{(n-1)}(z)}{h} = \frac{(n-1)!}{2\pi i} \int_C f(\zeta)\,\frac{1}{h} \left[ \frac{1}{(\zeta - z - h)^n} - \frac{1}{(\zeta - z)^n} \right] d\zeta. \tag{9}$$

We now recall that

$$A^n - B^n = (A - B)\left[ A^{n-1} + A^{n-2}B + \cdots + AB^{n-2} + B^{n-1} \right].$$

With A = 1/(ζ − z − h) and B = 1/(ζ − z), we see that the term in
brackets in equation (9) is equal to

$$\frac{h}{(\zeta - z - h)(\zeta - z)} \left[ A^{n-1} + A^{n-2}B + \cdots + AB^{n-2} + B^{n-1} \right].$$

But observe that if h is small, then z + h and z stay at a finite distance
from the boundary circle C, so in the limit as h tends to 0, we find that
the quotient converges to

$$\frac{(n-1)!}{2\pi i} \int_C f(\zeta) \left[ \frac{1}{(\zeta - z)^2} \right] \left[ \frac{n}{(\zeta - z)^{n-1}} \right] d\zeta = \frac{n!}{2\pi i} \int_C \frac{f(\zeta)}{(\zeta - z)^{n+1}}\,d\zeta,$$

which completes the induction argument and proves the theorem.


From now on, we call the formulas of Theorem 4.1 and Corollary 4.2 

the Cauchy integral formulas. 

Corollary 4.3 (Cauchy inequalities) If f is holomorphic in an open
set that contains the closure of a disc D centered at z_0 and of radius R,
then

$$|f^{(n)}(z_0)| \le \frac{n!\,\|f\|_C}{R^n},$$

where $\|f\|_C = \sup_{z \in C} |f(z)|$ denotes the supremum of |f| on the boundary
circle C.

Proof. Applying the Cauchy integral formula for $f^{(n)}(z_0)$, we obtain

$$|f^{(n)}(z_0)| = \left| \frac{n!}{2\pi i} \int_C \frac{f(\zeta)}{(\zeta - z_0)^{n+1}}\,d\zeta \right| = \frac{n!}{2\pi} \left| \int_0^{2\pi} \frac{f(z_0 + Re^{i\theta})}{(Re^{i\theta})^{n+1}}\,Rie^{i\theta}\,d\theta \right| \le \frac{n!}{2\pi}\,\frac{\|f\|_C}{R^n}\,2\pi.$$


Another striking consequence of the Cauchy integral formula is its 
connection with power series. In Chapter 1, we proved that a power series 
is holomorphic in the interior of its disc of convergence, and promised a 
proof of a converse, which is the content of the next theorem. 

Theorem 4.4 Suppose f is holomorphic in an open set Ω. If D is a
disc centered at z_0 and whose closure is contained in Ω, then f has a
power series expansion at z_0

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$$

for all z ∈ D, and the coefficients are given by

$$a_n = \frac{f^{(n)}(z_0)}{n!} \quad \text{for all } n \ge 0.$$

Proof. Fix z ∈ D. By the Cauchy integral formula, we have

$$f(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta, \tag{10}$$

where C denotes the boundary of the disc and z ∈ D. The idea is to
write

$$\frac{1}{\zeta - z} = \frac{1}{\zeta - z_0 - (z - z_0)} = \frac{1}{\zeta - z_0} \cdot \frac{1}{1 - \frac{z - z_0}{\zeta - z_0}}, \tag{11}$$

and use the geometric series expansion. Since ζ ∈ C and z ∈ D is fixed,
there exists 0 < r < 1 such that

$$\left| \frac{z - z_0}{\zeta - z_0} \right| < r,$$

therefore

$$\frac{1}{1 - \frac{z - z_0}{\zeta - z_0}} = \sum_{n=0}^{\infty} \left( \frac{z - z_0}{\zeta - z_0} \right)^n, \tag{12}$$

where the series converges uniformly for ζ ∈ C. This allows us to interchange
the infinite sum with the integral when we combine (10), (11),
and (12), thereby obtaining

$$f(z) = \sum_{n=0}^{\infty} \left( \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{(\zeta - z_0)^{n+1}}\,d\zeta \right) \cdot (z - z_0)^n.$$

This proves the power series expansion; further the use of the Cauchy integral
formulas for the derivatives (or simply differentiation of the series)
proves the formula for a_n.
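The coefficient formula obtained in the proof, a_n = (1/2πi) ∫_C f(ζ)/(ζ − z_0)^{n+1} dζ, lends itself to a direct numerical check. The sketch below is illustrative only: the helper name `taylor_coefficient`, the unit circle, and the test function e^z (whose coefficients are 1/n!) are assumptions for the example. It also checks the Cauchy inequalities in the form |a_n| ≤ ‖f‖_C / R^n.

```python
import cmath
import math

def taylor_coefficient(f, n, z0=0j, radius=1.0, steps=256):
    """Approximate a_n = (1 / (2 pi i)) * integral over |zeta - z0| = radius
    of f(zeta) / (zeta - z0)^(n+1) d zeta.
    With zeta = z0 + w and d zeta = i w d theta, the integrand reduces to
    f(z0 + w) / w^n, averaged over equally spaced points on the circle."""
    total = 0j
    for k in range(steps):
        w = radius * cmath.exp(2j * math.pi * k / steps)
        total += f(z0 + w) / w ** n
    return total / steps

if __name__ == "__main__":
    sup_on_circle = math.e  # sup of |e^z| on the unit circle, attained at z = 1
    for n in range(6):
        a_n = taylor_coefficient(cmath.exp, n)
        # Compare with 1/n! and with the Cauchy bound sup|f| / R^n (here R = 1).
        print(n, a_n, 1 / math.factorial(n), abs(a_n) <= sup_on_circle)
```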


Observe that since power series define indefinitely (complex) differentiable
functions, the theorem gives another proof that a holomorphic
function is automatically indefinitely differentiable.

Another important observation is that the power series expansion of
f centered at z_0 converges in any disc, no matter how large, as long
as its closure is contained in Ω. In particular, if f is entire (that is,
holomorphic on all of ℂ), the theorem implies that f has a power series
expansion around 0, say $f(z) = \sum_{n=0}^{\infty} a_n z^n$, that converges in all of ℂ.

Corollary 4.5 (Liouville’s theorem) If f is entire and bounded, then
f is constant.

Proof. It suffices to prove that f′ = 0, since ℂ is connected, and we
may then apply Corollary 3.4 in Chapter 1.

For each z_0 ∈ ℂ and all R > 0, the Cauchy inequalities yield

$$|f'(z_0)| \le \frac{B}{R}$$

where B is a bound for f. Letting R → ∞ gives the desired result.


As an application of our work so far, we can give an elegant proof of
the fundamental theorem of algebra.

Corollary 4.6 Every non-constant polynomial $P(z) = a_n z^n + \cdots + a_0$
with complex coefficients has a root in ℂ.

Proof. If P has no roots, then 1/P(z) is a bounded holomorphic
function. To see this, we can of course assume that a_n ≠ 0, and write

$$\frac{P(z)}{z^n} = a_n + \left( \frac{a_{n-1}}{z} + \cdots + \frac{a_0}{z^n} \right)$$

whenever z ≠ 0. Since each term in the parentheses goes to 0 as |z| → ∞
we conclude that there exists R > 0 so that if c = |a_n|/2, then

$$|P(z)| \ge c|z|^n \quad \text{whenever } |z| > R.$$

In particular, P is bounded from below when |z| > R. Since P is continuous
and has no roots in the disc |z| ≤ R, it is bounded from below in
that disc as well, thereby proving our claim.

By Liouville’s theorem we then conclude that 1/P is constant. This
contradicts our assumption that P is non-constant and proves the corollary.


Corollary 4.7 Every polynomial $P(z) = a_n z^n + \cdots + a_0$ of degree n ≥ 1
has precisely n roots in ℂ. If these roots are denoted by w_1, …, w_n,
then P can be factored as

$$P(z) = a_n (z - w_1)(z - w_2) \cdots (z - w_n).$$

Proof. By the previous result P has a root, say w_1. Then, writing
z = (z − w_1) + w_1, inserting this expression for z in P, and using the
binomial formula we get

$$P(z) = b_n (z - w_1)^n + \cdots + b_1 (z - w_1) + b_0,$$

where b_0, …, b_{n−1} are new coefficients, and b_n = a_n. Since P(w_1) = 0,
we find that b_0 = 0, therefore

$$P(z) = (z - w_1)\left[ b_n (z - w_1)^{n-1} + \cdots + b_1 \right] = (z - w_1) Q(z),$$

where Q is a polynomial of degree n − 1. By induction on the degree of
the polynomial, we conclude that P(z) has n roots and can be expressed
as

$$P(z) = c (z - w_1)(z - w_2) \cdots (z - w_n)$$

for some c ∈ ℂ. Expanding the right-hand side, we realize that the coefficient
of z^n is c and therefore c = a_n as claimed.
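The inductive step in this proof, splitting off a factor (z − w_1) and continuing with a polynomial Q of lower degree, can be carried out concretely by synthetic (Horner) division. The sketch below is an illustration under assumptions: the helper name `deflate`, the coefficient ordering, and the sample cubic with known roots 1, 2, 3 are all chosen for the example, and the roots are supplied by hand rather than computed.

```python
def deflate(coeffs, root):
    """Divide P(z) by (z - root) using Horner's scheme.
    coeffs lists the coefficients from the leading one down: [a_n, ..., a_0].
    Returns (coefficients of Q, remainder); the remainder equals P(root)."""
    values = []
    carry = 0
    for a in coeffs:
        carry = carry * root + a
        values.append(carry)
    remainder = values.pop()
    return values, remainder

if __name__ == "__main__":
    # P(z) = z^3 - 6 z^2 + 11 z - 6 = (z - 1)(z - 2)(z - 3)
    p = [1, -6, 11, -6]
    for w in (1, 2, 3):
        p, r = deflate(p, w)
        print(w, p, r)   # the remainder is 0 at each step because w is a root
    # After removing all n roots only the constant c = a_n remains.
```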

Finally, we end this section with a discussion of analytic continuation
(the third of the “miracles” we mentioned in the introduction). It states
that the “genetic code” of a holomorphic function is determined (that
is, the function is fixed) if we know its values on appropriate arbitrarily
small subsets. Note that in the theorem below, Ω is assumed connected.

Theorem 4.8 Suppose f is a holomorphic function in a region Ω that
vanishes on a sequence of distinct points with a limit point in Ω. Then
f is identically 0.

In other words, if the zeros of a holomorphic function f in the connected
open set Ω accumulate in Ω, then f = 0.

Proof. Suppose that z_0 ∈ Ω is a limit point for the sequence $\{w_k\}_{k=1}^{\infty}$
and that f(w_k) = 0. First, we show that f is identically zero in a small
disc containing z_0. For that, we choose a disc D centered at z_0 and
contained in Ω, and consider the power series expansion of f in that disc

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n.$$

If f is not identically zero, there exists a smallest integer m such that
a_m ≠ 0. But then we can write

$$f(z) = a_m (z - z_0)^m (1 + g(z - z_0)),$$

where g(z − z_0) converges to 0 as z → z_0. Taking z = w_k ≠ z_0 for a sequence
of points converging to z_0, we get a contradiction since
$a_m (w_k - z_0)^m \ne 0$ and $1 + g(w_k - z_0) \ne 0$, but f(w_k) = 0.

We conclude the proof using the fact that Ω is connected. Let U
denote the interior of the set of points where f(z) = 0. Then U is open
by definition and non-empty by the argument just given. The set U is
also closed since if z_n ∈ U and z_n → z, then f(z) = 0 by continuity, and
f vanishes in a neighborhood of z by the argument above. Hence z ∈ U.
Now if we let V denote the complement of U in Ω, we conclude that U
and V are both open, disjoint, and

$$\Omega = U \cup V.$$

Since Ω is connected we conclude that either U or V is empty. (Here we
use one of the two equivalent definitions of connectedness discussed in
Chapter 1.) Since z_0 ∈ U, we find that U = Ω and the proof is complete.


An immediate consequence of the theorem is the following.

Corollary 4.9 Suppose f and g are holomorphic in a region Ω and
f(z) = g(z) for all z in some non-empty open subset of Ω (or more generally
for z in some sequence of distinct points with limit point in Ω).
Then f(z) = g(z) throughout Ω.

Suppose we are given a pair of functions f and F analytic in regions
Ω and Ω′, respectively, with Ω ⊂ Ω′. If the two functions agree on the
smaller set Ω, we say that F is an analytic continuation of f into the
region Ω′. The corollary then guarantees that there can be only one such
analytic continuation, since F is uniquely determined by f.

5 Further applications 

We gather in this section various consequences of the results proved so 
far. 

5.1 Morera’s theorem 

A direct application of what was proved here is a converse of Cauchy’s 
theorem. 

Theorem 5.1 Suppose f is a continuous function in the open disc D
such that for any triangle T contained in D

$$\int_T f(z)\,dz = 0,$$

then f is holomorphic.

Proof. By the proof of Theorem 2.1 the function f has a primitive F
in D that satisfies F′ = f. By the regularity theorem, we know that F
is indefinitely (and hence twice) complex differentiable, and therefore f
is holomorphic.

5.2 Sequences of holomorphic functions 

Theorem 5.2 If $\{f_n\}_{n=1}^{\infty}$ is a sequence of holomorphic functions that
converges uniformly to a function f in every compact subset of Ω, then
f is holomorphic in Ω.

Proof. Let D be any disc whose closure is contained in Ω and T
any triangle in that disc. Then, since each f_n is holomorphic, Goursat’s
theorem implies

$$\int_T f_n(z)\,dz = 0 \quad \text{for all } n.$$

By assumption f_n → f uniformly in the closure of D, so f is continuous
and

$$\int_T f_n(z)\,dz \to \int_T f(z)\,dz.$$

As a result, we find $\int_T f(z)\,dz = 0$, and by Morera’s theorem, we conclude
that f is holomorphic in D. Since this conclusion is true for every D
whose closure is contained in Ω, we find that f is holomorphic in all of
Ω.

This is a striking result that is obviously not true in the case of real
variables: the uniform limit of continuously differentiable functions need
not be differentiable. For example, we know that every continuous function
on [0,1] can be approximated uniformly by polynomials, by Weierstrass’s
theorem (see Chapter 5, Book I), yet not every continuous function
is differentiable.

We can go one step further and deduce convergence theorems for the
sequence of derivatives. Recall that if f is a power series with radius
of convergence R, then f′ can be obtained by differentiating term by
term the series for f, and moreover f′ has radius of convergence R. (See
Theorem 2.6 in Chapter 1.) This implies in particular that if S_n are the
partial sums of f, then S_n′ converges uniformly to f′ on every compact
subset of the disc of convergence of f. The next theorem generalizes this
fact.

Theorem 5.3 Under the hypotheses of the previous theorem, the sequence
of derivatives $\{f_n'\}_{n=1}^{\infty}$ converges uniformly to f′ on every compact
subset of Ω.

Proof. We may assume without loss of generality that the sequence of
functions in the theorem converges uniformly on all of Ω. Given δ > 0,
let Ω_δ denote the subset of Ω defined by

$$\Omega_\delta = \{ z \in \Omega : \overline{D_\delta}(z) \subset \Omega \}.$$

In other words, Ω_δ consists of all points in Ω which are at distance > δ
from its boundary. To prove the theorem, it suffices to show that {f_n′}
converges uniformly to f′ on Ω_δ for each δ. This is achieved by proving
the following inequality:

$$\sup_{z \in \Omega_\delta} |F'(z)| \le \frac{1}{\delta} \sup_{\zeta \in \Omega} |F(\zeta)| \tag{13}$$

whenever F is holomorphic in Ω, since it can then be applied to
F = f_n − f to prove the desired fact. The inequality (13) follows at
once from the Cauchy integral formula and the definition of Ω_δ, since for
every z ∈ Ω_δ the closure of D_δ(z) is contained in Ω and

$$F'(z) = \frac{1}{2\pi i} \int_{C_\delta(z)} \frac{F(\zeta)}{(\zeta - z)^2}\,d\zeta.$$

Hence,

$$|F'(z)| \le \frac{1}{2\pi} \int_{C_\delta(z)} \frac{|F(\zeta)|}{|\zeta - z|^2}\,|d\zeta| \le \frac{1}{2\pi} \sup_{\zeta \in \Omega} |F(\zeta)|\,\frac{1}{\delta^2}\,2\pi\delta = \frac{1}{\delta} \sup_{\zeta \in \Omega} |F(\zeta)|,$$

as was to be shown.

Of course, there is nothing special about the first derivative, and in
fact under the hypotheses of the last theorem, we may conclude (arguing
as above) that for every k ≥ 0 the sequence of kth derivatives $\{f_n^{(k)}\}_{n=1}^{\infty}$
converges uniformly to $f^{(k)}$ on every compact subset of Ω.

In practice, one often uses Theorem 5.2 to construct holomorphic functions
(say, with a prescribed property) as a series

$$F(z) = \sum_{n=1}^{\infty} f_n(z). \tag{14}$$

Indeed, if each f_n is holomorphic in a given region Ω of the complex
plane, and the series converges uniformly in compact subsets of Ω, then
Theorem 5.2 guarantees that F is also holomorphic in Ω. For instance,
various special functions are often expressed in terms of a converging
series like (14). A specific example is the Riemann zeta function discussed
in Chapter 6.

We now turn to a variant of this idea, which consists of functions 
defined in terms of integrals. 


5.3 Holomorphic functions defined in terms of integrals 

As we shall see later in this book, a number of other special functions
are defined in terms of integrals of the type

$$f(z) = \int_a^b F(z, s)\,ds,$$

or as limits of such integrals. Here, the function F is holomorphic in the
first argument, and continuous in the second. The integral is taken in
the sense of Riemann integration over the bounded interval [a, b]. The
problem then is to establish that f is holomorphic.

In the next theorem, we impose a sufficient condition on F, often
satisfied in practice, that easily implies that f is holomorphic.

After a simple linear change of variables, we may assume that a = 0
and b = 1.

Theorem 5.4 Let F(z, s) be defined for (z, s) ∈ Ω × [0, 1] where Ω is an
open set in ℂ. Suppose F satisfies the following properties:

(i) F(z, s) is holomorphic in z for each s.

(ii) F is continuous on Ω × [0, 1].

Then the function f defined on Ω by

$$f(z) = \int_0^1 F(z, s)\,ds$$

is holomorphic.

The second condition says that F is jointly continuous in both arguments.

To prove this result, it suffices to prove that f is holomorphic in any
disc D contained in Ω, and by Morera’s theorem this could be achieved
by showing that for any triangle T contained in D we have

$$\int_T \int_0^1 F(z, s)\,ds\,dz = 0.$$

Interchanging the order of integration, and using property (i) would then
yield the desired result. We can, however, get around the issue of justifying
the change in the order of integration by arguing differently. The
idea is to interpret the integral as a “uniform” limit of Riemann sums,
and then apply the results of the previous section.

Proof. For each n ≥ 1, we consider the Riemann sum

$$f_n(z) = \frac{1}{n} \sum_{k=1}^{n} F(z, k/n).$$

Then f_n is holomorphic in all of Ω by property (i), and we claim that
on any disc D whose closure is contained in Ω, the sequence $\{f_n\}_{n=1}^{\infty}$
converges uniformly to f. To see this, we recall that a continuous function
on a compact set is uniformly continuous, so if ε > 0 there exists δ > 0
such that

$$\sup_{z \in D} |F(z, s_1) - F(z, s_2)| < \epsilon \quad \text{whenever } |s_1 - s_2| < \delta.$$

Then, if n > 1/δ, and z ∈ D we have

$$|f_n(z) - f(z)| = \left| \sum_{k=1}^{n} \int_{(k-1)/n}^{k/n} \big( F(z, k/n) - F(z, s) \big)\,ds \right| \le \sum_{k=1}^{n} \int_{(k-1)/n}^{k/n} |F(z, k/n) - F(z, s)|\,ds < \sum_{k=1}^{n} \frac{\epsilon}{n} \le \epsilon.$$

This proves the claim, and by Theorem 5.2 we conclude that f is holomorphic
in D. As a consequence, f is holomorphic in Ω, as was to be
shown.
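The uniform convergence of the Riemann sums f_n to f can be observed numerically. The sketch below is a hedged illustration: the integrand F(z, s) = e^{zs}, for which the integral over [0, 1] is (e^z − 1)/z, is an assumption chosen for the example, as are the helper names and the finite sample of the closed unit disc used to estimate the supremum.

```python
import cmath
import math

def F(z, s):
    # Illustrative integrand (an assumption): holomorphic in z, continuous in s.
    return cmath.exp(z * s)

def riemann_sum(z, n):
    # f_n(z) = (1/n) * sum_{k=1}^{n} F(z, k/n), as in the proof.
    return sum(F(z, k / n) for k in range(1, n + 1)) / n

def exact_integral(z):
    # Integral of e^{zs} over s in [0, 1]: (e^z - 1)/z, with value 1 at z = 0.
    return (cmath.exp(z) - 1) / z if z != 0 else 1 + 0j

def sup_error(n, radius=1.0, rings=4, angles=24):
    # Largest |f_n(z) - f(z)| over a finite sample of the disc |z| <= radius.
    points = [0j]
    for j in range(1, rings + 1):
        for m in range(angles):
            points.append(radius * j / rings * cmath.exp(2j * math.pi * m / angles))
    return max(abs(riemann_sum(z, n) - exact_integral(z)) for z in points)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, sup_error(n))   # the sampled sup error decreases as n grows
```

A finite sample can only suggest, not prove, uniform convergence; the proof above supplies the actual estimate.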

5.4 Schwarz reflection principle 

In real analysis, there are various situations where one wishes to extend 
a function from a given set to a larger one. Several techniques exist 
that provide extensions for continuous functions, and more generally for 
functions with varying degrees of smoothness. Of course, the difficulty of 
the technique increases as we impose more conditions on the extension. 

The situation is very different for holomorphic functions. Not only are
these functions indefinitely differentiable in their domain of definition,
but they also have additional characteristically rigid properties, which
make them difficult to mold. For example, there exist holomorphic functions
in a disc which are continuous on the closure of the disc, but which
cannot be continued (analytically) into any region larger than the disc.
(This phenomenon is discussed in Problem 1.) Another fact we have
seen above is that holomorphic functions must be identically zero if they
vanish on small open sets (or even, for example, a non-zero line segment).

It turns out that the theory developed in this chapter provides a simple 
extension phenomenon that is very useful in applications: the Schwarz 
reflection principle. The proof consists of two parts. First we define the 
extension, and then check that the resulting function is still holomorphic. 
We begin with this second point. 

Let Ω be an open subset of ℂ that is symmetric with respect to the
real line, that is

$$z \in \Omega \quad \text{if and only if} \quad \bar{z} \in \Omega.$$

Let Ω⁺ denote the part of Ω that lies in the upper half-plane and Ω⁻
that part that lies in the lower half-plane.

Figure 11. An open set symmetric across the real axis

Also, let I = Ω ∩ ℝ, so that I denotes the interior of that part of the
boundary of Ω⁺ and Ω⁻ that lies on the real axis. Then we have

$$\Omega^+ \cup I \cup \Omega^- = \Omega$$

and the only interesting case of the next theorem occurs, of course, when
I is non-empty.

Theorem 5.5 (Symmetry principle) If f⁺ and f⁻ are holomorphic
functions in Ω⁺ and Ω⁻ respectively, that extend continuously to I and

$$f^+(x) = f^-(x) \quad \text{for all } x \in I,$$

then the function f defined on Ω by

$$f(z) = \begin{cases} f^+(z) & \text{if } z \in \Omega^+, \\ f^+(z) = f^-(z) & \text{if } z \in I, \\ f^-(z) & \text{if } z \in \Omega^- \end{cases}$$

is holomorphic on all of Ω.


Proof. One notes first that f is continuous throughout Ω. The only
difficulty is to prove that f is holomorphic at points of I. Suppose D is a
disc centered at a point on I and entirely contained in Ω. We prove that
f is holomorphic in D by Morera’s theorem. Suppose T is a triangle in
D. If T does not intersect I, then

$$\int_T f(z)\,dz = 0$$

since f is holomorphic in the upper and lower half-discs. Suppose now
that one side or vertex of T is contained in I, and the rest of T is in,
say, the upper half-disc. If T_ε is the triangle obtained from T by slightly
raising the edge or vertex which lies on I, we have $\int_{T_\epsilon} f = 0$ since T_ε is
entirely contained in the upper half-disc (an illustration of the case when
an edge lies on I is given in Figure 12(a)). We then let ε → 0, and by
continuity we conclude that

$$\int_T f(z)\,dz = 0.$$

Figure 12. (a) Raising a vertex; (b) splitting a triangle

If the interior of T intersects I, we can reduce the situation to the
previous one by writing T as the union of triangles each of which has an
edge or vertex on I as shown in Figure 12(b). By Morera’s theorem we
conclude that f is holomorphic in D, as was to be shown.

We can now state the extension principle, where we use the above
notation.

Theorem 5.6 (Schwarz reflection principle) Suppose that f is a holomorphic
function in Ω⁺ that extends continuously to I and such that f
is real-valued on I. Then there exists a function F holomorphic in all of
Ω such that F = f on Ω⁺.

Proof. The idea is simply to define F(z) for z ∈ Ω⁻ by

$$F(z) = \overline{f(\bar{z})}.$$

To prove that F is holomorphic in Ω⁻ we note that if z, z_0 ∈ Ω⁻, then
z̄, z̄_0 ∈ Ω⁺ and hence, the power series expansion of f near z̄_0 gives

$$f(\bar{z}) = \sum a_n (\bar{z} - \bar{z}_0)^n.$$

As a consequence we see that

$$F(z) = \sum \bar{a}_n (z - z_0)^n$$

and F is holomorphic in Ω⁻. Since f is real valued on I we have $\overline{f(x)} = f(x)$
whenever x ∈ I and hence F extends continuously up to I. The
proof is complete once we invoke the symmetry principle.

5.5 Runge’s approximation theorem 

We know by Weierstrass’s theorem that any continuous function on a
compact interval can be approximated uniformly by polynomials.⁴ With
this result in mind, one may inquire about similar approximations in
complex analysis. More precisely, we ask the following question: what
conditions on a compact set K ⊂ ℂ guarantee that any function holomorphic
in a neighborhood of this set can be approximated uniformly by
polynomials on K?

An example of this is provided by power series expansions. We recall
that if f is a holomorphic function in a disc D, then it has a power series
expansion $f(z) = \sum_{n=0}^{\infty} a_n z^n$ that converges uniformly on every compact
set K ⊂ D. By taking partial sums of this series, we conclude that f can
be approximated uniformly by polynomials on any compact subset of D.

In general, however, some condition on K must be imposed, as we see
by considering the function f(z) = 1/z on the unit circle K = C. Indeed,
recall that $\int_C f(z)\,dz = 2\pi i$, and if p is any polynomial, then Cauchy’s
theorem implies $\int_C p(z)\,dz = 0$, and this quickly leads to a contradiction.
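The obstruction in this example can be seen numerically: the integral of 1/z over the unit circle is 2πi, while the integral of a polynomial is 0, so their difference has integral of modulus 2π over a curve of length 2π, which forces sup |1/z − p(z)| ≥ 1 on the circle. The sketch below is illustrative; the helper name `circle_integral` and the sample polynomial are assumptions.

```python
import cmath
import math

def circle_integral(g, steps=512):
    """Approximate the integral of g over the positively oriented unit circle,
    using z = e^{i theta} and dz = i z d theta at equally spaced angles."""
    dtheta = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        z = cmath.exp(1j * k * dtheta)
        total += g(z) * 1j * z * dtheta
    return total

def p(z):
    # An arbitrary sample polynomial (hypothetical choice for illustration).
    return 3 - 2 * z + 0.5 * z ** 3

if __name__ == "__main__":
    print(circle_integral(lambda z: 1 / z))   # close to 2*pi*i
    print(circle_integral(p))                 # close to 0
```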


⁴ A proof may be found in Section 1.8, Chapter 5, of Book I.

A restriction on K that guarantees the approximation pertains to the
topology of its complement: K^c must be connected. In fact, a slight modification
of the above example when f(z) = 1/z proves that this condition
on K is also necessary; see Problem 4.

Conversely, uniform approximations exist when K^c is connected, and
this result follows from a theorem of Runge which states that for any K
a uniform approximation exists by rational functions with “singularities”
in the complement of K.⁵ This result is remarkable since rational functions
are globally defined, while f is given only in a neighborhood of K.
In particular, f could be defined independently on different components
of K, making the conclusion of the theorem even more striking.

Theorem 5.7 Any function holomorphic in a neighborhood of a compact 
set K can be approximated uniformly on K by rational functions whose 
singularities are in K c . 

If K c is connected, any function holomorphic in a neighborhood of K 
can be approximated uniformly on K by polynomials. 

We shall see how the second part of the theorem follows from the 
first: when K c is connected, one can “push” the singularities to infinity 
thereby transforming the rational functions into polynomials. 

The key to the theorem lies in an integral representation formula that is 
a simple consequence of the Cauchy integral formula applied to a square. 

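Lemma 5.8 Suppose $f$ is holomorphic in an open set $\Omega$, and $K \subset \Omega$ is
compact. Then, there exist finitely many segments $\gamma_1, \ldots, \gamma_N$ in $\Omega - K$
such that

$$f(z) = \sum_{n=1}^{N} \frac{1}{2\pi i} \int_{\gamma_n} \frac{f(\zeta)}{\zeta - z}\,d\zeta \quad \text{for all } z \in K. \tag{15}$$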

Proof. Let $d = c \cdot d(K, \Omega^c)$, where $c$ is any constant $< 1/\sqrt{2}$, and
consider a grid formed by (solid) squares with sides parallel to the axes
and of length $d$.

We let $\mathcal{Q} = \{Q_1, \ldots, Q_M\}$ denote the finite collection of squares in
this grid that intersect $K$, with the boundary of each square given the
positive orientation. (We denote by $\partial Q_m$ the boundary of the square
$Q_m$.) Finally, we let $\gamma_1, \ldots, \gamma_N$ denote the sides of squares in $\mathcal{Q}$ that do
not belong to two adjacent squares in $\mathcal{Q}$. (See Figure 13.) The choice of
$d$ guarantees that for each $n$, $\gamma_n \subset \Omega$, and $\gamma_n$ does not intersect $K$; for if
it did, then it would belong to two adjacent squares in $\mathcal{Q}$, contradicting
our choice of $\gamma_n$.


5 These singularities are points where the function is not holomorphic, and are “poles”, 
as defined in the next chapter. 
Figure 13. The union of the $\gamma_n$'s is in bold-face


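Since for any $z \in K$ that is not on the boundary of a square in $\mathcal{Q}$ there
exists $j$ so that $z \in Q_j$, Cauchy's theorem implies

$$\frac{1}{2\pi i} \int_{\partial Q_m} \frac{f(\zeta)}{\zeta - z}\,d\zeta = \begin{cases} f(z) & \text{if } m = j, \\ 0 & \text{if } m \neq j. \end{cases}$$

Thus, for all such $z$ we have

$$f(z) = \sum_{m=1}^{M} \frac{1}{2\pi i} \int_{\partial Q_m} \frac{f(\zeta)}{\zeta - z}\,d\zeta.$$

However, if $Q_m$ and $Q_{m'}$ are adjacent, the integral over their common side
is taken once in each direction, and these cancel. This establishes (15)
when $z$ is in $K$ and not on the boundary of a square in $\mathcal{Q}$. Since $\gamma_n \subset K^c$,
continuity guarantees that this relation continues to hold for all $z \in K$,
as was to be shown.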


The first part of Theorem 5.7 is therefore a consequence of the next 
lemma. 


Lemma 5.9 For any line segment $\gamma$ entirely contained in $\Omega - K$, there
exists a sequence of rational functions with singularities on $\gamma$ that approximate
the integral $\int_\gamma f(\zeta)/(\zeta - z)\,d\zeta$ uniformly on $K$.

Proof. If $\gamma(t) : [0,1] \to \mathbb{C}$ is a parametrization for $\gamma$, then

$$\int_\gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta = \int_0^1 \frac{f(\gamma(t))}{\gamma(t) - z}\,\gamma'(t)\,dt.$$

Since $\gamma$ does not intersect $K$, the integrand $F(z,t)$ in this last integral
is jointly continuous on $K \times [0,1]$, and since $K$ is compact, given $\epsilon > 0$,
there exists $\delta > 0$ such that

$$\sup_{z \in K} |F(z,t_1) - F(z,t_2)| < \epsilon \quad \text{whenever } |t_1 - t_2| < \delta.$$

Arguing as in the proof of Theorem 5.4, we see that the Riemann sums
of the integral $\int_0^1 F(z,t)\,dt$ approximate it uniformly on $K$. Since each
of these Riemann sums is a rational function with singularities on $\gamma$, the
lemma is proved.
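
The Riemann-sum step in this proof can be checked numerically. The following is only an illustrative sketch under stated assumptions: the segment runs from $a = 2$ to $b = 2 + i$ with the parametrization $\gamma(t) = a + t(b - a)$, the compact set $K$ is sampled by a few points of the closed unit disc, and $f \equiv 1$ is chosen so that the integral has the closed form $\log((b - z)/(a - z))$ (principal branch). The names `riemann_sum` and `exact_integral` are ours and do not come from the text.

```python
import cmath


def riemann_sum(f, a, b, z, n):
    """Left-endpoint Riemann sum for the integral of f(zeta)/(zeta - z)
    over the segment gamma(t) = a + t*(b - a), 0 <= t <= 1.

    For fixed n this is a rational function of z whose only poles are
    the sample points gamma(k/n), all of which lie on the segment.
    """
    total = 0j
    for k in range(n):
        zeta = a + (b - a) * k / n                      # gamma(t_k)
        total += f(zeta) / (zeta - z) * (b - a) / n     # gamma'(t) = b - a, dt = 1/n
    return total


def exact_integral(a, b, z):
    """Closed form of the integral when f is identically 1 and z is off the segment."""
    return cmath.log((b - z) / (a - z))


# Assumed sample of K (points of the closed unit disc) and assumed segment.
K_samples = [0, 0.5j, -1, 1, 1j]
a, b = 2, 2 + 1j

for n in (10, 100, 1000):
    worst = max(abs(riemann_sum(lambda w: 1, a, b, z, n) - exact_integral(a, b, z))
                for z in K_samples)
    print(n, worst)   # the worst-case error over the sample shrinks as n grows
```

The point of the experiment is that a single choice of $n$ works for every sampled $z$ at once, which is the uniformity asserted in the lemma.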

Finally, the process of pushing the poles to infinity is accomplished by
using the fact that $K^c$ is connected. Since any rational function whose
only singularity is at the point $z_0$ is a polynomial in $1/(z - z_0)$, it suffices
to establish the next lemma to complete the proof of Theorem 5.7.

Lemma 5.10 If $K^c$ is connected and $z_0 \notin K$, then the function
$1/(z - z_0)$ can be approximated uniformly on $K$ by polynomials.

Proof. First, we choose a point $z_1$ that is outside a large open disc $D$
centered at the origin and which contains $K$. Then

$$\frac{1}{z - z_1} = -\frac{1}{z_1} \cdot \frac{1}{1 - z/z_1} = \sum_{n=0}^{\infty} -\frac{z^n}{z_1^{n+1}},$$

where the series converges uniformly for $z \in K$. The partial sums of
this series are polynomials that provide a uniform approximation to
$1/(z - z_1)$ on $K$. In particular, this implies that any power $1/(z - z_1)^k$
can also be approximated uniformly on $K$ by polynomials.

It now suffices to prove that $1/(z - z_0)$ can be approximated uniformly
on $K$ by polynomials in $1/(z - z_1)$. To do so, we use the fact that $K^c$ is
connected to travel from $z_0$ to the point $z_1$. Let $\gamma$ be a curve in $K^c$ that
is parametrized by $\gamma(t)$ on $[0,1]$, and such that $\gamma(0) = z_0$ and $\gamma(1) = z_1$.
If we let $\rho = \frac{1}{2} d(K, \gamma)$, then $\rho > 0$ since $\gamma$ and $K$ are compact. We then
choose a sequence of points $\{w_0, \ldots, w_\ell\}$ on $\gamma$ such that $w_0 = z_0$, $w_\ell = z_1$,
and $|w_j - w_{j+1}| < \rho$ for all $0 \le j < \ell$.

We claim that if $w$ is a point on $\gamma$, and $w'$ any other point with
$|w - w'| < \rho$, then $1/(z - w)$ can be approximated uniformly on $K$ by
polynomials in $1/(z - w')$. To see this, note that

$$\frac{1}{z - w} = \frac{1}{z - w'} \cdot \frac{1}{1 - \frac{w - w'}{z - w'}} = \sum_{n=0}^{\infty} \frac{(w - w')^n}{(z - w')^{n+1}},$$

and since the sum converges uniformly for $z \in K$, the approximation by
partial sums proves our claim.

This result allows us to travel from $z_0$ to $z_1$ through the finite sequence
$\{w_j\}$ to find that $1/(z - z_0)$ can be approximated uniformly on $K$ by
polynomials in $1/(z - z_1)$. This concludes the proof of the lemma, and
also that of the theorem.
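
One step of the pole-pushing argument can be illustrated numerically. This is a hedged sketch, not part of the original proof: it assumes $K$ is the unit circle (sampled at 64 points), takes $w = 2$ and $w' = 2.4$, so that $|w - w'| = 0.4$ is smaller than $\rho = \frac{1}{2} d(K, \gamma)$ for a curve $\gamma$ along the real axis to the right of $2$, and the function name `partial_sum` is our own.

```python
import cmath
import math


def partial_sum(z, w, w_prime, terms):
    """Sum of (w - w')^n / (z - w')^(n+1) for n = 0, ..., terms - 1.

    This is a polynomial in 1/(z - w'), so its only singularity is at w'.
    """
    u = 1 / (z - w_prime)            # the variable 1/(z - w')
    ratio = (w - w_prime) * u        # common ratio of the geometric series
    total = 0j
    term = u                         # the n = 0 term
    for _ in range(terms):
        total += term
        term *= ratio
    return total


# Assumed data: K = unit circle, pole moved from w = 2 to w' = 2.4.
K_samples = [cmath.exp(2j * math.pi * k / 64) for k in range(64)]
w, w_prime = 2.0, 2.4

for terms in (5, 10, 20, 40):
    worst = max(abs(partial_sum(z, w, w_prime, terms) - 1 / (z - w)) for z in K_samples)
    print(terms, worst)   # worst-case error on the sample decays geometrically
```

Repeating this step along the chain $w_0, \ldots, w_\ell$ is what moves the singularity from $z_0$ to $z_1$.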


6 Exercises 


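1. Prove that

$$\int_0^\infty \sin(x^2)\,dx = \int_0^\infty \cos(x^2)\,dx = \frac{\sqrt{2\pi}}{4}.$$

These are the Fresnel integrals. Here, $\int_0^\infty$ is interpreted as $\lim_{R \to \infty} \int_0^R$.

[Hint: Integrate the function $e^{-z^2}$ over the path in Figure 14. Recall that
$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$.]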


2. Show that

$$\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$

[Hint: The integral equals $\frac{1}{2i} \int_{-\infty}^{\infty} \frac{e^{ix} - 1}{x}\,dx$. Use the indented semicircle.]


3. Evaluate the integrals

$$\int_0^\infty e^{-ax} \cos bx\,dx \quad \text{and} \quad \int_0^\infty e^{-ax} \sin bx\,dx, \quad a > 0,$$

by integrating $e^{-Az}$, $A = \sqrt{a^2 + b^2}$, over an appropriate sector with angle $\omega$, with
$\cos \omega = a/A$.
4. Prove that for all $\xi \in \mathbb{C}$ we have

$$e^{-\pi \xi^2} = \int_{-\infty}^{\infty} e^{-\pi x^2} e^{2\pi i x \xi}\,dx.$$


5. Suppose $f$ is continuously complex differentiable on $\Omega$, and $T \subset \Omega$ is a triangle
whose interior is also contained in $\Omega$. Apply Green's theorem to show that

$$\int_T f(z)\,dz = 0.$$

This provides a proof of Goursat's theorem under the additional assumption that
$f'$ is continuous.

[Hint: Green's theorem says that if $(F, G)$ is a continuously differentiable vector
field, then

$$\int_T F\,dx + G\,dy = \int_{\text{Interior of } T} \left( \frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} \right) dx\,dy.$$

For appropriate $F$ and $G$, one can then use the Cauchy-Riemann equations.]


6. Let $\Omega$ be an open subset of $\mathbb{C}$ and let $T \subset \Omega$ be a triangle whose interior is also
contained in $\Omega$. Suppose that $f$ is a function holomorphic in $\Omega$ except possibly at
a point $w$ inside $T$. Prove that if $f$ is bounded near $w$, then

$$\int_T f(z)\,dz = 0.$$


7. Suppose $f : \mathbb{D} \to \mathbb{C}$ is holomorphic. Show that the diameter
$d = \sup_{z, w \in \mathbb{D}} |f(z) - f(w)|$ of the image of $f$ satisfies

$$2|f'(0)| \le d.$$

Moreover, it can be shown that equality holds precisely when $f$ is linear,
$f(z) = a_0 + a_1 z$.

Note. In connection with this result, see the relationship between the diameter of
a curve and Fourier series described in Problem 1, Chapter 4, Book I.

[Hint: $2f'(0) = \frac{1}{2\pi i} \int_{|\zeta| = r} \frac{f(\zeta) - f(-\zeta)}{\zeta^2}\,d\zeta$ whenever $0 < r < 1$.]

8. If $f$ is a holomorphic function on the strip $-1 < y < 1$, $x \in \mathbb{R}$ with

$$|f(z)| \le A(1 + |z|)^\eta, \quad \eta \text{ a fixed real number}$$

for all $z$ in that strip, show that for each integer $n \ge 0$ there exists $A_n \ge 0$ so that
$|f^{(n)}(x)| \le A_n (1 + |x|)^\eta$, for all $x \in \mathbb{R}$.


[Hint: Use the Cauchy inequalities.] 
9. Let $\Omega$ be a bounded open subset of $\mathbb{C}$, and $\varphi : \Omega \to \Omega$ a holomorphic function.
Prove that if there exists a point $z_0 \in \Omega$ such that

$$\varphi(z_0) = z_0 \quad \text{and} \quad \varphi'(z_0) = 1$$

then $\varphi$ is linear.

[Hint: Why can one assume that $z_0 = 0$? Write $\varphi(z) = z + a_n z^n + O(z^{n+1})$ near
$0$, and prove that if $\varphi_k = \varphi \circ \cdots \circ \varphi$ (where $\varphi$ appears $k$ times), then
$\varphi_k(z) = z + k a_n z^n + O(z^{n+1})$. Apply the Cauchy inequalities and let $k \to \infty$ to conclude
the proof. Here we use the standard $O$ notation, where $f(z) = O(g(z))$ as $z \to 0$
means that $|f(z)| \le C|g(z)|$ for some constant $C$ as $|z| \to 0$.]

10. Weierstrass's theorem states that a continuous function on $[0,1]$ can be uniformly
approximated by polynomials. Can every continuous function on the closed
unit disc be approximated uniformly by polynomials in the variable $z$?


11. Let $f$ be a holomorphic function on the disc $D_{R_0}$ centered at the origin and
of radius $R_0$.

(a) Prove that whenever $0 < R < R_0$ and $|z| < R$, then

$$f(z) = \frac{1}{2\pi} \int_0^{2\pi} f(Re^{i\varphi})\, \mathrm{Re}\!\left( \frac{Re^{i\varphi} + z}{Re^{i\varphi} - z} \right) d\varphi.$$

(b) Show that

$$\mathrm{Re}\!\left( \frac{Re^{i\gamma} + r}{Re^{i\gamma} - r} \right) = \frac{R^2 - r^2}{R^2 - 2Rr\cos\gamma + r^2}.$$

[Hint: For the first part, note that if $w = R^2/\bar{z}$, then the integral of $f(\zeta)/(\zeta - w)$
around the circle of radius $R$ centered at the origin is zero. Use this, together with
the usual Cauchy integral formula, to deduce the desired identity.]


12. Let $u$ be a real-valued function defined on the unit disc $\mathbb{D}$. Suppose that $u$ is
twice continuously differentiable and harmonic, that is,

$$\Delta u(x, y) = 0$$

for all $(x, y) \in \mathbb{D}$.

(a) Prove that there exists a holomorphic function $f$ on the unit disc such that

$$\mathrm{Re}(f) = u.$$

Also show that the imaginary part of $f$ is uniquely defined up to an additive
(real) constant. [Hint: From the previous chapter we would have $f'(z) = 2\,\partial u/\partial z$.
Therefore, let $g(z) = 2\,\partial u/\partial z$ and prove that $g$ is holomorphic.
Why can one find $F$ with $F' = g$? Prove that $\mathrm{Re}(F)$ differs from $u$ by a real
constant.]

(b) Deduce from this result, and from Exercise 11, the Poisson integral representation
formula from the Cauchy integral formula: If $u$ is harmonic in the
unit disc and continuous on its closure, then if $z = re^{i\theta}$ one has

$$u(z) = \frac{1}{2\pi} \int_0^{2\pi} P_r(\theta - \varphi)\, u(\varphi)\,d\varphi$$

where $P_r(\gamma)$ is the Poisson kernel for the unit disc given by

$$P_r(\gamma) = \frac{1 - r^2}{1 - 2r\cos\gamma + r^2}.$$

13. Suppose $f$ is an analytic function defined everywhere in $\mathbb{C}$ and such that for
each $z_0 \in \mathbb{C}$ at least one coefficient in the expansion

$$f(z) = \sum_{n=0}^{\infty} c_n (z - z_0)^n$$

is equal to $0$. Prove that $f$ is a polynomial.

[Hint: Use the fact that $c_n n! = f^{(n)}(z_0)$ and use a countability argument.]

14. Suppose that $f$ is holomorphic in an open set containing the closed unit disc,
except for a pole at $z_0$ on the unit circle. Show that if

$$\sum_{n=0}^{\infty} a_n z^n$$

denotes the power series expansion of $f$ in the open unit disc, then

$$\lim_{n \to \infty} \frac{a_n}{a_{n+1}} = z_0.$$


15. Suppose $f$ is a non-vanishing continuous function on $\overline{\mathbb{D}}$ that is holomorphic in
$\mathbb{D}$. Prove that if

$$|f(z)| = 1 \quad \text{whenever } |z| = 1,$$

then $f$ is constant.

[Hint: Extend $f$ to all of $\mathbb{C}$ by $f(z) = 1/\overline{f(1/\bar{z})}$ whenever $|z| > 1$, and argue as in
the Schwarz reflection principle.]


7 Problems 

1. Here are some examples of analytic functions on the unit disc that cannot be
extended analytically past the unit circle. The following definition is needed. Let
$f$ be a function defined in the unit disc $\mathbb{D}$, with boundary circle $C$. A point $w$ on $C$
is said to be regular for $f$ if there is an open neighborhood $U$ of $w$ and an analytic
function $g$ on $U$, so that $f = g$ on $\mathbb{D} \cap U$. A function $f$ defined on $\mathbb{D}$ cannot be
continued analytically past the unit circle if no point of $C$ is regular for $f$.

(a) Let

$$f(z) = \sum_{n=0}^{\infty} z^{2^n} \quad \text{for } |z| < 1.$$

Notice that the radius of convergence of the above series is $1$. Show that
$f$ cannot be continued analytically past the unit disc. [Hint: Suppose
$\theta = 2\pi p/2^k$, where $p$ and $k$ are positive integers. Let $z = re^{i\theta}$; then
$|f(re^{i\theta})| \to \infty$ as $r \to 1$.]

(b) * Fix $0 < \alpha < \infty$. Show that the analytic function $f$ defined by

$$f(z) = \sum_{n=0}^{\infty} 2^{-n\alpha} z^{2^n} \quad \text{for } |z| < 1$$

extends continuously to the unit circle, but cannot be analytically continued
past the unit circle. [Hint: There is a nowhere differentiable function lurking
in the background. See Chapter 4 in Book I.]


2.* Let

$$F(z) = \sum_{n=1}^{\infty} d(n) z^n \quad \text{for } |z| < 1$$

where $d(n)$ denotes the number of divisors of $n$. Observe that the radius of convergence
of this series is $1$. Verify the identity

$$\sum_{n=1}^{\infty} d(n) z^n = \sum_{n=1}^{\infty} \frac{z^n}{1 - z^n}.$$

Using this identity, show that if $z = r$ with $0 < r < 1$, then

$$|F(r)| \ge c\, \frac{1}{1 - r} \log(1/(1 - r))$$

as $r \to 1$. Similarly, if $\theta = 2\pi p/q$ where $p$ and $q$ are positive integers and $z = re^{i\theta}$,
then

$$|F(re^{i\theta})| \ge c_{p/q}\, \frac{1}{1 - r} \log(1/(1 - r))$$

as $r \to 1$. Conclude that $F$ cannot be continued analytically past the unit disc.


3. Morera's theorem states that if $f$ is continuous in $\mathbb{C}$, and $\int_T f(z)\,dz = 0$ for all
triangles $T$, then $f$ is holomorphic in $\mathbb{C}$. Naturally, we may ask if the conclusion
still holds if we replace triangles by other sets.

(a) Suppose that $f$ is continuous on $\mathbb{C}$, and

$$\int_C f(z)\,dz = 0 \tag{16}$$

for every circle $C$. Prove that $f$ is holomorphic.

(b) More generally, let $\Gamma$ be any toy contour, and $\mathcal{F}$ the collection of all translates
and dilates of $\Gamma$. Show that if $f$ is continuous on $\mathbb{C}$, and

$$\int_\gamma f(z)\,dz = 0 \quad \text{for all } \gamma \in \mathcal{F}$$

then $f$ is holomorphic. In particular, Morera's theorem holds under the
weaker assumption that $\int_T f(z)\,dz = 0$ for all equilateral triangles.

[Hint: As a first step, assume that $f$ is twice real differentiable, and write $f(z) =
f(z_0) + a(z - z_0) + b\overline{(z - z_0)} + O(|z - z_0|^2)$ for $z$ near $z_0$. Integrating this expansion
over small circles around $z_0$ yields $\partial f/\partial \bar{z} = b = 0$ at $z_0$. Alternatively, suppose
only that $f$ is differentiable and apply Green's theorem to conclude that the real
and imaginary parts of $f$ satisfy the Cauchy-Riemann equations.

In general, let $\varphi(w) = \varphi(x, y)$ (when $w = x + iy$) denote a smooth function with
$0 \le \varphi(w) \le 1$, and $\int_{\mathbb{R}^2} \varphi(w)\,dV(w) = 1$, where $dV(w) = dx\,dy$, and $\int$ denotes the
usual integral of a function of two variables in $\mathbb{R}^2$. For each $\epsilon > 0$, let
$\varphi_\epsilon(z) = \epsilon^{-2} \varphi(\epsilon^{-1} z)$, as well as

$$f_\epsilon(z) = \int_{\mathbb{R}^2} f(z - w)\, \varphi_\epsilon(w)\,dV(w),$$

where the integral denotes the usual integral of functions of two variables, with
$dV(w)$ the area element of $\mathbb{R}^2$. Then $f_\epsilon$ is smooth, satisfies condition (16), and
$f_\epsilon \to f$ uniformly on any compact subset of $\mathbb{C}$.]

4. Prove the converse to Runge's theorem: if $K$ is a compact set whose complement
is not connected, then there exists a function $f$ holomorphic in a neighborhood of
$K$ which cannot be approximated uniformly by polynomials on $K$.

[Hint: Pick a point $z_0$ in a bounded component of $K^c$, and let $f(z) = 1/(z - z_0)$.
If $f$ can be approximated uniformly by polynomials on $K$, show that there exists a
polynomial $p$ such that $|(z - z_0)p(z) - 1| < 1$. Use the maximum modulus principle
(Chapter 3) to show that this inequality continues to hold for all $z$ in the component
of $K^c$ that contains $z_0$.]

5. * There exists an entire function $F$ with the following “universal” property: given
any entire function $h$, there is an increasing sequence $\{N_k\}_{k=1}^{\infty}$ of positive integers,
so that

$$\lim_{k \to \infty} F(z + N_k) = h(z)$$

uniformly on every compact subset of $\mathbb{C}$.

(a) Let $p_1, p_2, \ldots$ denote an enumeration of the collection of polynomials whose
coefficients have rational real and imaginary parts. Show that it suffices
to find an entire function $F$ and an increasing sequence $\{M_n\}$ of positive
integers, such that

$$|F(z) - p_n(z - M_n)| < \frac{1}{n} \quad \text{whenever } z \in D_n, \tag{17}$$

where $D_n$ denotes the disc centered at $M_n$ and of radius $n$. [Hint: Given
$h$ entire, there exists a sequence $\{n_k\}$ such that $\lim_{k \to \infty} p_{n_k}(z) = h(z)$ uniformly
on every compact subset of $\mathbb{C}$.]

(b) Construct $F$ satisfying (17) as an infinite series

$$F(z) = \sum_{n=1}^{\infty} u_n(z)$$

where $u_n(z) = p_n(z - M_n)\, e^{-c_n (z - M_n)^2}$, and the quantities $c_n > 0$ and $M_n > 0$
are chosen appropriately with $c_n \to 0$ and $M_n \to \infty$. [Hint: The function
$e^{-z^2}$ vanishes rapidly as $|z| \to \infty$ in the sectors $\{|\arg z| < \pi/4 - \delta\}$ and
$\{|\pi - \arg z| < \pi/4 - \delta\}$.]

In the same spirit, there exists an alternate “universal” entire function $G$ with
the following property: given any entire function $h$, there is an increasing sequence
$\{N_k\}_{k=1}^{\infty}$ of positive integers, so that

$$\lim_{k \to \infty} D^{N_k} G(z) = h(z)$$

uniformly on every compact subset of $\mathbb{C}$. Here $D^j G$ denotes the $j$-th (complex)
derivative of $G$.


Meromorphic Functions and 
the Logarithm 


One knows that the differential calculus, which has
contributed so much to the progress of analysis, is
founded on the consideration of differential coefficients,
that is derivatives of functions. When one attributes
an infinitesimal increase $\epsilon$ to the variable $x$, the function
$f(x)$ of this variable undergoes in general an infinitesimal
increase of which the first term is proportional
to $\epsilon$, and the finite coefficient of $\epsilon$ of this increase
is what is called its differential coefficient... If
considering the values of $x$ where $f(x)$ becomes infinite,
we add to one of these values designated by $x_1$,
the infinitesimal $\epsilon$, and then develop $f(x_1 + \epsilon)$ in increasing
power of the same quantity, the first terms
of this development contain negative powers of $\epsilon$; one
of these will be the product of $1/\epsilon$ with a finite coefficient,
which we will call the residue of the function
$f(x)$, relative to the particular value $x_1$ of the variable
$x$. Residues of this kind present themselves naturally
in several branches of algebraic and infinitesimal analysis.
Their consideration furnish methods that can be
simply used, that apply to a large number of diverse
questions, and that give new formulae that would seem
to be of interest to mathematicians.

A. L. Cauchy, 1826 


There is a general principle in the theory, already implicit in Riemann's
work, which states that analytic functions are in an essential way characterized
by their singularities. That is to say, globally analytic functions
are “effectively” determined by their zeros, and meromorphic functions
by their zeros and poles. While these assertions cannot be formulated
as precise general theorems, there are nevertheless significant instances
where this principle applies.

We begin this chapter by considering singularities, in particular the
different kinds of point singularities (“isolated” singularities) that a holomorphic
function can have. In order of increasing severity, these are:

• removable singularities

• poles

• essential singularities.

The first type is harmless since a function can actually be extended
to be holomorphic at its removable singularities (hence the name). Near
the third type, the function oscillates and may grow faster than any
power, and a complete understanding of its behavior is not easy. For the
second type the analysis is more straightforward and is connected with
the calculus of residues, which arises as follows.

Recall that by Cauchy's theorem a holomorphic function $f$ in an open
set which contains a closed curve $\gamma$ and its interior satisfies

$$\int_\gamma f(z)\,dz = 0.$$

The question that occurs is: what happens if $f$ has a pole in the interior
of the curve? To try to answer this question consider the example $f(z) =
1/z$, and recall that if $C$ is a (positively oriented) circle centered at $0$,
then

$$\int_C \frac{dz}{z} = 2\pi i.$$
This turns out to be the key ingredient in the calculus of residues. 

A new aspect appears when we consider indefinite integrals of holomorphic
functions that have singularities. As the basic example $f(z) = 1/z$
shows, the resulting “function” (in this case the logarithm) may not be
single-valued, and understanding this phenomenon is of importance for
a number of subjects. Exploiting this multi-valuedness leads in effect to
the “argument principle.” We can use this principle to count the number
of zeros of a holomorphic function inside a suitable curve. As a simple
consequence of this result, we obtain a significant geometric property of
holomorphic functions: they are open mappings. From this, the maximum
principle, another important feature of holomorphic functions, is
an easy step.

In order to turn to the logarithm itself, and come to grips with the 
precise nature of its multi-valuedness, we introduce the notions of homo- 
topy of curves and simply connected domains. It is on the latter type of 
open sets that single-valued branches of the logarithm can be defined. 

1 Zeros and poles 

By definition, a point singularity of a function $f$ is a complex number
$z_0$ such that $f$ is defined in a neighborhood of $z_0$ but not at the point
$z_0$ itself. We shall also call such points isolated singularities. For
example, if the function $f$ is defined only on the punctured plane by
$f(z) = z$, then the origin is a point singularity. Of course, in that case,
the function $f$ can actually be defined at $0$ by setting $f(0) = 0$, so that
the resulting extension is continuous and in fact entire. (Such points
are then called removable singularities.) More interesting is the case
of the function $g(z) = 1/z$ defined in the punctured plane. It is clear
now that $g$ cannot be defined as a continuous function, much less as
a holomorphic function, at the point $0$. In fact, $g(z)$ grows to infinity
as $z$ approaches $0$, and we shall say that the origin is a pole singularity.
Finally, the case of the function $h(z) = e^{1/z}$ on the punctured plane shows
that removable singularities and poles do not tell the whole story. Indeed,
the function $h(z)$ grows indefinitely as $z$ approaches $0$ on the positive real
line, while $h$ approaches $0$ as $z$ goes to $0$ on the negative real axis. Finally
$h$ oscillates rapidly, yet remains bounded, as $z$ approaches the origin on
the imaginary axis.
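
The three behaviors of $h(z) = e^{1/z}$ near the origin are easy to observe numerically. The snippet below is a small illustrative sketch; the sample values of $t$ and the function name `h` (matching the text's notation) are our own choices.

```python
import cmath


def h(z):
    """The function h(z) = exp(1/z), defined on the punctured plane."""
    return cmath.exp(1 / z)


# Approach the origin along the positive real axis, the negative real axis,
# and the imaginary axis, and print the modulus of h in each case.
for t in (0.1, 0.05, 0.01):
    print(t, abs(h(t)), abs(h(-t)), abs(h(1j * t)))
# First column of moduli blows up, second tends to 0, third stays equal to 1
# (up to rounding) because 1/(i t) is purely imaginary.
```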

Since singularities often appear because the denominator of a fraction
vanishes, we begin with a local study of the zeros of a holomorphic
function.

A complex number $z_0$ is a zero for the holomorphic function $f$ if
$f(z_0) = 0$. In particular, analytic continuation shows that the zeros of
a non-trivial holomorphic function are isolated. In other words, if $f$ is
holomorphic in $\Omega$ and $f(z_0) = 0$ for some $z_0 \in \Omega$, then there exists an
open neighborhood $U$ of $z_0$ such that $f(z) \neq 0$ for all $z \in U - \{z_0\}$, unless
$f$ is identically zero. We start with a local description of a holomorphic
function near a zero.

Theorem 1.1 Suppose that $f$ is holomorphic in a connected open set $\Omega$,
has a zero at a point $z_0 \in \Omega$, and does not vanish identically in $\Omega$. Then
there exists a neighborhood $U \subset \Omega$ of $z_0$, a non-vanishing holomorphic
function $g$ on $U$, and a unique positive integer $n$ such that

$$f(z) = (z - z_0)^n g(z) \quad \text{for all } z \in U.$$

Proof. Since $\Omega$ is connected and $f$ is not identically zero, we conclude
that $f$ is not identically zero in a neighborhood of $z_0$. In a small disc
centered at $z_0$ the function $f$ has a power series expansion

$$f(z) = \sum_{k=0}^{\infty} a_k (z - z_0)^k.$$

Since $f$ is not identically zero near $z_0$, there exists a smallest integer $n$
such that $a_n \neq 0$. Then, we can write

$$f(z) = (z - z_0)^n [a_n + a_{n+1}(z - z_0) + \cdots] = (z - z_0)^n g(z),$$

where $g$ is defined by the series in brackets, and hence is holomorphic,
and is nowhere vanishing for all $z$ close to $z_0$ (since $a_n \neq 0$). To prove
the uniqueness of the integer $n$, suppose that we can also write

$$f(z) = (z - z_0)^n g(z) = (z - z_0)^m h(z)$$

where $h(z_0) \neq 0$. If $m > n$, then we may divide by $(z - z_0)^n$ to see that

$$g(z) = (z - z_0)^{m-n} h(z)$$

and letting $z \to z_0$ yields $g(z_0) = 0$, a contradiction. If $m < n$ a similar
argument gives $h(z_0) = 0$, which is also a contradiction. We conclude
that $m = n$, thus $h = g$, and the theorem is proved.

In the case of the above theorem, we say that $f$ has a zero of order
$n$ (or multiplicity $n$) at $z_0$. If a zero is of order $1$, we say that it is
simple. We observe that, quantitatively, the order describes the rate at
which the function vanishes.

The importance of the previous theorem comes from the fact that
we can now describe precisely the type of singularity possessed by the
function $1/f$ at $z_0$.

For this purpose, it is now convenient to define a deleted neighborhood
of $z_0$ to be an open disc centered at $z_0$, minus the point $z_0$, that
is, the set

$$\{z : 0 < |z - z_0| < r\}$$

for some $r > 0$. Then, we say that a function $f$ defined in a deleted
neighborhood of $z_0$ has a pole at $z_0$, if the function $1/f$, defined to be
zero at $z_0$, is holomorphic in a full neighborhood of $z_0$.

Theorem 1.2 If $f$ has a pole at $z_0 \in \Omega$, then in a neighborhood of that
point there exist a non-vanishing holomorphic function $h$ and a unique
positive integer $n$ such that

$$f(z) = (z - z_0)^{-n} h(z).$$

Proof. By the previous theorem we have $1/f(z) = (z - z_0)^n g(z)$,
where $g$ is holomorphic and non-vanishing in a neighborhood of $z_0$, so
the result follows with $h(z) = 1/g(z)$.

The integer $n$ is called the order (or multiplicity) of the pole, and
describes the rate at which the function grows near $z_0$. If the pole is of
order $1$, we say that it is simple.

The next theorem should be reminiscent of power series expansion, 
except that now we allow terms of negative order, to account for the 
presence of a pole. 


Theorem 1.3 If $f$ has a pole of order $n$ at $z_0$, then

$$f(z) = \frac{a_{-n}}{(z - z_0)^n} + \frac{a_{-n+1}}{(z - z_0)^{n-1}} + \cdots + \frac{a_{-1}}{(z - z_0)} + G(z) \tag{1}$$

where $G$ is a holomorphic function in a neighborhood of $z_0$.


Proof. The proof follows from the multiplicative statement in the
previous theorem. Indeed, the function $h$ has a power series expansion

$$h(z) = A_0 + A_1(z - z_0) + \cdots$$

so that

$$f(z) = (z - z_0)^{-n}(A_0 + A_1(z - z_0) + \cdots) = \frac{a_{-n}}{(z - z_0)^n} + \frac{a_{-n+1}}{(z - z_0)^{n-1}} + \cdots + \frac{a_{-1}}{(z - z_0)} + G(z).$$


The sum

$$\frac{a_{-n}}{(z - z_0)^n} + \frac{a_{-n+1}}{(z - z_0)^{n-1}} + \cdots + \frac{a_{-1}}{(z - z_0)}$$

is called the principal part of $f$ at the pole $z_0$, and the coefficient $a_{-1}$ is
the residue of $f$ at that pole. We write $\mathrm{res}_{z_0} f = a_{-1}$. The importance of
the residue comes from the fact that all the other terms in the principal
part, that is, those of order strictly greater than $1$, have primitives in a
deleted neighborhood of $z_0$. Therefore, if $P(z)$ denotes the principal part
above and $C$ is any circle centered at $z_0$, we get

$$\frac{1}{2\pi i} \int_C P(z)\,dz = a_{-1}.$$

We shall return to this important point in the section on the residue
formula.

As we shall see, in many cases, the evaluation of integrals reduces to
the calculation of residues. In the case when $f$ has a simple pole at $z_0$,
it is clear that

$$\mathrm{res}_{z_0} f = \lim_{z \to z_0} (z - z_0) f(z).$$

If the pole is of higher order, a similar formula holds, one that involves
differentiation as well as taking a limit.


Theorem 1.4 If $f$ has a pole of order $n$ at $z_0$, then

$$\mathrm{res}_{z_0} f = \lim_{z \to z_0} \frac{1}{(n-1)!} \left( \frac{d}{dz} \right)^{n-1} (z - z_0)^n f(z).$$

The theorem is an immediate consequence of formula (1), which implies

$$(z - z_0)^n f(z) = a_{-n} + a_{-n+1}(z - z_0) + \cdots + a_{-1}(z - z_0)^{n-1} + G(z)(z - z_0)^n.$$

2 The residue formula 

We now discuss the celebrated residue formula. Our approach follows the 
discussion of Cauchy’s theorem in the last chapter: we first consider the 
case of the circle and its interior the disc, and then explain generalizations 
to toy contours and their interiors. 


Theorem 2.1 Suppose that $f$ is holomorphic in an open set containing
a circle $C$ and its interior, except for a pole at $z_0$ inside $C$. Then

$$\int_C f(z)\,dz = 2\pi i\, \mathrm{res}_{z_0} f.$$

Proof. Once again, we may choose a keyhole contour that avoids the
pole, and let the width of the corridor go to zero to see that

$$\int_C f(z)\,dz = \int_{C_\epsilon} f(z)\,dz$$

where $C_\epsilon$ is the small circle centered at the pole $z_0$ and of radius $\epsilon$.
Now we observe that

$$\frac{1}{2\pi i} \int_{C_\epsilon} \frac{a_{-1}}{z - z_0}\,dz = a_{-1}$$

is an immediate consequence of the Cauchy integral formula (Theorem 4.1
of the previous chapter), applied to the constant function $f = a_{-1}$.
Similarly,

$$\frac{1}{2\pi i} \int_{C_\epsilon} \frac{a_{-k}}{(z - z_0)^k}\,dz = 0$$

when $k > 1$, by using the corresponding formulae for the derivatives
(Corollary 4.2 also in the previous chapter). But we know that in a
neighborhood of $z_0$ we can write

$$f(z) = \frac{a_{-n}}{(z - z_0)^n} + \frac{a_{-n+1}}{(z - z_0)^{n-1}} + \cdots + \frac{a_{-1}}{z - z_0} + G(z),$$

where $G$ is holomorphic. By Cauchy's theorem, we also know that
$\int_{C_\epsilon} G(z)\,dz = 0$, hence $\int_{C_\epsilon} f(z)\,dz = 2\pi i\, a_{-1}$. This implies the desired
result.


This theorem can be generalized to the case of finitely many poles in 
the circle, as well as to the case of toy contours. 

Corollary 2.2 Suppose that $f$ is holomorphic in an open set containing
a circle $C$ and its interior, except for poles at the points $z_1, \ldots, z_N$ inside
$C$. Then

$$\int_C f(z)\,dz = 2\pi i \sum_{k=1}^{N} \mathrm{res}_{z_k} f.$$


For the proof, consider a multiple keyhole which has a loop avoiding 
each one of the poles. Let the width of the corridors go to zero. In 
the limit, the integral over the large circle equals a sum of integrals over 
small circles to which Theorem 2.1 applies. 

Corollary 2.3 Suppose that $f$ is holomorphic in an open set containing
a toy contour $\gamma$ and its interior, except for poles at the points $z_1, \ldots, z_N$
inside $\gamma$. Then

$$\int_\gamma f(z)\,dz = 2\pi i \sum_{k=1}^{N} \mathrm{res}_{z_k} f.$$


In the above, we take 7 to have positive orientation. 

The proof consists of choosing a keyhole appropriate for the given toy 
contour, so that, as we have seen previously, we can reduce the situation 
to integrating over small circles around the poles where Theorem 2.1 
applies. 

The identity $\int_\gamma f(z)\,dz = 2\pi i \sum_{k=1}^{N} \mathrm{res}_{z_k} f$ is referred to as the residue
formula.
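
The residue formula can be tested numerically on a circle containing two poles. What follows is a hedged sketch rather than anything from the text: the function $f(z) = e^z/(z(z - \frac{1}{2}))$, the unit circle as contour, and the helper `contour_integral_circle` (a trapezoid rule in the angle) are all our own choices. Both poles are simple, so the residues come from $\lim (z - z_k) f(z)$: at $0$ this gives $-2$, and at $\frac{1}{2}$ it gives $2e^{1/2}$.

```python
import cmath
import math


def contour_integral_circle(f, center, radius, n=4000):
    """Trapezoid-rule approximation of the integral of f over the positively
    oriented circle |z - center| = radius."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * k / n
        w = radius * cmath.exp(1j * theta)
        total += f(center + w) * 1j * w      # f(z) * dz/dtheta
    return total * (2 * math.pi / n)


def f(z):
    # Illustrative example with simple poles at 0 and 1/2, both inside |z| = 1.
    return cmath.exp(z) / (z * (z - 0.5))


residues = [-2.0, 2 * math.exp(0.5)]          # residues at 0 and at 1/2
left_side = contour_integral_circle(f, 0, 1.0)
right_side = 2j * math.pi * sum(residues)
print(left_side, right_side)
```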


2.1 Examples 

The calculus of residues provides a powerful technique to compute a
wide range of integrals. In the examples we give next, we evaluate three
improper Riemann integrals of the form

$$\int_{-\infty}^{\infty} f(x)\,dx.$$

The main idea is to extend $f$ to the complex plane, and then choose a
family $\gamma_R$ of toy contours so that

$$\lim_{R \to \infty} \int_{\gamma_R} f(z)\,dz = \int_{-\infty}^{\infty} f(x)\,dx.$$

By computing the residues of $f$ at its poles, we easily obtain $\int_{\gamma_R} f(z)\,dz$.
The challenging part is to choose the contours $\gamma_R$, so that the above limit
holds. Often, this choice is motivated by the decay behavior of $f$.

Example 1. First, we prove that

$$\int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = \pi \tag{2}$$

by using contour integration. Note that if we make the change of variables
$x \mapsto x/y$, this yields

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{y\,dx}{y^2 + x^2} = \int_{-\infty}^{\infty} \mathcal{P}_y(x)\,dx.$$

In other words, formula (2) says that the integral of the Poisson kernel
$\mathcal{P}_y(x)$ is equal to $1$ for each $y > 0$. This was proved quite easily in
Lemma 2.5 of Chapter 5 in Book I, since $1/(1 + x^2)$ is the derivative of
$\arctan x$. Here we provide a residue calculation that leads to another
proof of (2).

Consider the function

$$f(z) = \frac{1}{1 + z^2},$$

which is holomorphic in the complex plane except for simple poles at the
points $i$ and $-i$. Also, we choose the contour $\gamma_R$ shown in Figure 1. The
contour consists of the segment $[-R, R]$ on the real axis and of a large
half-circle centered at the origin in the upper half-plane.

Since we may write

$$f(z) = \frac{1}{(z - i)(z + i)}$$

we see that the residue of $f$ at $i$ is simply $1/2i$. Therefore, if $R$ is large
enough, we have

$$\int_{\gamma_R} f(z)\,dz = \frac{2\pi i}{2i} = \pi.$$

If we denote by $C_R^+$ the large half-circle of radius $R$, we see that

$$\left| \int_{C_R^+} f(z)\,dz \right| \le \pi R\, \frac{B}{R^2} \le \frac{M}{R},$$

where we have used the fact that $|f(z)| \le B/|z|^2$ when $z \in C_R^+$ and $R$ is
large. So this integral goes to $0$ as $R \to \infty$. Therefore, in the limit we
find that

$$\int_{-\infty}^{\infty} f(x)\,dx = \pi,$$

as desired. We remark that in this example, there is nothing special
about our choice of the semicircle in the upper half-plane. One gets the
same conclusion if one uses the semicircle in the lower half-plane, with
the other pole and the appropriate residue.
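
The two pieces of the contour in Example 1 can be examined numerically. The sketch below is only an illustration under our own choices: the arc integral is approximated by the midpoint rule, the segment integral uses the antiderivative $\arctan x$ mentioned above, and the names `arc_integral` and `segment_integral` are hypothetical. It shows that the two pieces add up to $\pi$ for each $R > 1$, while the arc contribution is bounded by $\pi R/(R^2 - 1)$ and so decays like $1/R$.

```python
import cmath
import math


def f(z):
    return 1 / (1 + z * z)


def arc_integral(R, n=20000):
    """Midpoint-rule approximation of the integral of f over the upper
    half-circle z = R e^{i theta}, 0 <= theta <= pi."""
    h = math.pi / n
    total = 0j
    for k in range(n):
        theta = (k + 0.5) * h
        z = R * cmath.exp(1j * theta)
        total += f(z) * 1j * z * h           # f(z) * dz/dtheta * dtheta
    return total


def segment_integral(R):
    """Integral of f over [-R, R], computed exactly from the antiderivative arctan."""
    return 2 * math.atan(R)


for R in (2, 10, 100):
    arc = arc_integral(R)
    print(R, segment_integral(R) + arc, abs(arc), math.pi * R / (R * R - 1))
```

The bound $\pi R/(R^2 - 1)$ comes from $|1 + z^2| \ge R^2 - 1$ on the arc, which is one admissible choice of the constant $B$ in the estimate above.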


Example 2. An integral that will play an important role in Chapter 6 
is 

e ax 7T 

- - dx = -- ， 0 < a < 1. 

1 + e x sm 7ra 

To prove this formula, let f(z) = e az /(l + e z ), and consider the con¬ 
tour consisting of a rectangle in the upper half-plane with a side lying 
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2ni ^ R 


7TI 


R 0 R 

Figure 2. The contour 7^ in Example 2 


on the real axis, and a parallel side on the line Im(z) = 2tt, as shown in 
Figure 2. 

The only point in the rectangle 7 丑 where the denominator of / vanishes 
is 2: = 7rz. To compute the residue of / at that point, we argue as follows: 
First, note 


(z- ni)f(z ) 二 e a 


7TI 


7TI 


1 + e 2 


e z - e nl 


We recognize on the right the inverse of a difference quotient, and in fact 


p~ — u 

lim - — = e 1Tl = -l 

Z — 7TZ 

since e z is its own derivative. Therefore, the function / has a simple pole 
at 7ri with residue 


/ a 

=-e 


As a consequence, the residue formula says that 


(3) 


/ 


-2nie a 


'1R 


We now investigate the integrals of $f$ over each side of the rectangle. Let
$I_R$ denote

$$\int_{-R}^{R} f(x)\,dx$$

and $I$ the integral we wish to compute, so that $I_R \to I$ as $R \to \infty$. Then,
it is clear that the integral of $f$ over the top side of the rectangle (with
the orientation from right to left) is

$$-e^{2\pi i a} I_R.$$

Finally, if $A_R = \{R + it : 0 \le t \le 2\pi\}$ denotes the vertical side on the
right, then

$$\left| \int_{A_R} f \right| \le \int_0^{2\pi} \left| \frac{e^{a(R+it)}}{1 + e^{R+it}} \right| dt \le C e^{(a-1)R},$$

and since $a < 1$, this integral tends to 0 as $R \to \infty$. Similarly, the integral
over the vertical segment on the left goes to 0, since it can be bounded
by $Ce^{-aR}$ and $a > 0$. Therefore, in the limit as $R$ tends to infinity, the
identity (3) yields

$$I - e^{2\pi i a} I = -2\pi i\, e^{a\pi i},$$

from which we deduce

$$I = -2\pi i\,\frac{e^{a\pi i}}{1 - e^{2\pi i a}} = \frac{2\pi i}{e^{\pi i a} - e^{-\pi i a}} = \frac{\pi}{\sin \pi a},$$

and the computation is complete.
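
The formula of Example 2 can be checked numerically. The sketch below is only an illustration: the trapezoid rule, the truncation length `L`, the step `h`, and the sample values of `a` are assumptions chosen for convenience, and the integrand is rewritten so that no exponential overflows.

```python
import math

def integrand(x, a):
    """e^{ax} / (1 + e^x), written so that no exponential overflows."""
    if x >= 0:
        return math.exp((a - 1.0) * x) / (1.0 + math.exp(-x))
    return math.exp(a * x) / (1.0 + math.exp(x))

def integral(a, L=200.0, h=0.01):
    """Trapezoid rule on [-L, L]; the tails beyond L are negligible for the a used below."""
    n = int(round(2 * L / h))
    total = 0.5 * (integrand(-L, a) + integrand(L, a))
    for k in range(1, n):
        total += integrand(-L + k * h, a)
    return total * h

# Pairs (numerical value, pi / sin(pi a)) for a few sample exponents.
results = {a: (integral(a), math.pi / math.sin(math.pi * a)) for a in (0.3, 0.5, 0.8)}
for a, (numeric, closed_form) in results.items():
    print(a, numeric, closed_form)
```

For each sample value of `a` the two printed numbers should agree to many digits, which is consistent with the residue computation above.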

Example 3. Now we calculate another Fourier transform, namely

$$\int_{-\infty}^{\infty} \frac{e^{-2\pi i x \xi}}{\cosh \pi x}\,dx = \frac{1}{\cosh \pi \xi}$$

where

$$\cosh z = \frac{e^{z} + e^{-z}}{2}.$$

In other words, the function $1/\cosh \pi x$ is its own Fourier transform, a
property also shared by $e^{-\pi x^2}$ (see Example 1, Chapter 2). To see this,
we use a rectangle $\gamma_R$ as shown on Figure 3 whose width goes to infinity,
but whose height is fixed.

[Figure 3. The contour $\gamma_R$ in Example 3; labeled points: $-R$, $0$, $R$, $\alpha$, $\beta$, $2i$]

For a fixed $\xi \in \mathbb{R}$, let

$$f(z) = \frac{e^{-2\pi i z \xi}}{\cosh \pi z}$$

and note that the denominator of $f$ vanishes precisely when $e^{\pi z} = -e^{-\pi z}$,
that is, when $e^{2\pi z} = -1$. In other words, the only poles of $f$ inside the
rectangle are at the points $\alpha = i/2$ and $\beta = 3i/2$. To find the residue of
$f$ at $\alpha$, we note that

$$(z - \alpha) f(z) = e^{-2\pi i z \xi}\,\frac{2(z - \alpha)}{e^{\pi z} + e^{-\pi z}} = 2 e^{-2\pi i z \xi} e^{\pi z}\,\frac{z - \alpha}{e^{2\pi z} - e^{2\pi \alpha}}.$$

We recognize on the right the reciprocal of the difference quotient for the
function $e^{2\pi z}$ at $z = \alpha$. Therefore

$$\lim_{z \to \alpha} (z - \alpha) f(z) = 2 e^{-2\pi i \alpha \xi} e^{\pi \alpha}\,\frac{1}{2\pi e^{2\pi \alpha}} = \frac{e^{\pi \xi}}{\pi i},$$

which shows that $f$ has a simple pole at $\alpha$ with residue $e^{\pi \xi}/(\pi i)$. Similarly,
we find that $f$ has a simple pole at $\beta$ with residue $-e^{3\pi \xi}/(\pi i)$.

We dispense with the integrals of $f$ on the vertical sides by showing
that they go to zero as $R$ tends to infinity. Indeed, if $z = R + iy$ with
$0 \le y \le 2$, then

$$|e^{-2\pi i z \xi}| \le e^{4\pi |\xi|},$$

and

$$|\cosh \pi z| = \left| \frac{e^{\pi z} + e^{-\pi z}}{2} \right| \ge \frac{1}{2} \left| |e^{\pi z}| - |e^{-\pi z}| \right| \ge \frac{1}{2}\left(e^{\pi R} - e^{-\pi R}\right) \to \infty \quad \text{as } R \to \infty,$$

which shows that the integral over the vertical segment on the right goes
to 0 as $R \to \infty$. A similar argument shows that the integral of $f$ over
the vertical segment on the left also goes to 0 as $R \to \infty$. Finally, we see
that if $I$ denotes the integral we wish to calculate, then the integral of $f$
over the top side of the rectangle (with the orientation from right to left)
is simply $-e^{4\pi \xi} I$ where we have used the fact that $\cosh \pi \zeta$ is periodic
with period $2i$. In the limit as $R$ tends to infinity, the residue formula
gives

$$I - e^{4\pi \xi} I = 2\pi i \left( \frac{e^{\pi \xi}}{\pi i} - \frac{e^{3\pi \xi}}{\pi i} \right) = -2 e^{2\pi \xi}\left(e^{\pi \xi} - e^{-\pi \xi}\right),$$

and since $1 - e^{4\pi \xi} = -e^{2\pi \xi}(e^{2\pi \xi} - e^{-2\pi \xi})$, we find that

$$I = 2\,\frac{e^{\pi \xi} - e^{-\pi \xi}}{e^{2\pi \xi} - e^{-2\pi \xi}} = 2\,\frac{e^{\pi \xi} - e^{-\pi \xi}}{(e^{\pi \xi} - e^{-\pi \xi})(e^{\pi \xi} + e^{-\pi \xi})} = \frac{2}{e^{\pi \xi} + e^{-\pi \xi}} = \frac{1}{\cosh \pi \xi},$$

as claimed.

A similar argument actually establishes the following formula:

$$\int_{-\infty}^{\infty} e^{-2\pi i x \xi}\,\frac{\sin \pi a}{\cosh \pi x + \cos \pi a}\,dx = \frac{2 \sinh 2\pi a \xi}{\sinh 2\pi \xi}$$

whenever $0 < a < 1$, and where $\sinh z = (e^{z} - e^{-z})/2$. We have proved
above the particular case $a = 1/2$. This identity can be used to determine
an explicit formula for the Poisson kernel for the strip (see Problem 3 in
Chapter 5 of Book I), or to prove the sum of two squares theorem, as we
shall see in Chapter 10.
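
Both the case $a = 1/2$ of Example 3 and the more general identity can be tested numerically. The following sketch is an illustration under stated assumptions: it uses a plain trapezoid rule on a truncated interval, keeps only the cosine part of $e^{-2\pi i x \xi}$ (the sine part integrates to zero because the remaining factor is even in $x$), and the function names `lhs` and `rhs` are hypothetical.

```python
import math

def lhs(a, xi, L=20.0, h=0.005):
    """Trapezoid approximation of the integral over [-L, L] (cosine part only)."""
    n = int(round(2 * L / h))
    s = math.sin(math.pi * a)
    c = math.cos(math.pi * a)
    total = 0.0
    for k in range(n + 1):
        x = -L + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.cos(2 * math.pi * x * xi) * s / (math.cosh(math.pi * x) + c)
    return total * h

def rhs(a, xi):
    """Closed form 2 sinh(2 pi a xi) / sinh(2 pi xi), for xi != 0."""
    return 2 * math.sinh(2 * math.pi * a * xi) / math.sinh(2 * math.pi * xi)

print(lhs(0.5, 0.7), 1.0 / math.cosh(0.7 * math.pi))   # Example 3
print(lhs(0.3, 0.4), rhs(0.3, 0.4))                    # general identity
```

In each printed pair the two numbers should agree closely; the sample points `(a, xi)` are arbitrary choices.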


3 Singularities and meromorphic functions 

Returning to Section 1, we see that we have described the analytical
character of a function near a pole. We now turn our attention to the
other types of isolated singularities.

Let $f$ be a function holomorphic in an open set $\Omega$ except possibly at
one point $z_0$ in $\Omega$. If we can define $f$ at $z_0$ in such a way that $f$ becomes
holomorphic in all of $\Omega$, we say that $z_0$ is a removable singularity for $f$.

Theorem 3.1 (Riemann's theorem on removable singularities)

Suppose that $f$ is holomorphic in an open set $\Omega$ except possibly at a point
$z_0$ in $\Omega$. If $f$ is bounded on $\Omega - \{z_0\}$, then $z_0$ is a removable singularity.


Proof. Since the problem is local we may consider a small disc $D$
centered at $z_0$ and whose closure is contained in $\Omega$. Let $C$ denote the
boundary circle of that disc with the usual positive orientation. We
shall prove that if $z \in D$ and $z \ne z_0$, then under the assumptions of the
theorem we have

(4) $$f(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta.$$

Since an application of Theorem 5.4 in the previous chapter proves that
the right-hand side of equation (4) defines a holomorphic function on
all of $D$ that agrees with $f(z)$ when $z \ne z_0$, this gives us the desired
extension.

To prove formula (4) we fix $z \in D$ with $z \ne z_0$ and use the familiar toy
contour illustrated in Figure 4.



Figure 4. The multiple keyhole contour in the proof of Riemann's theorem


The multiple keyhole avoids the two points $z$ and $z_0$. Letting the sides
of the corridors get closer to each other, and finally overlap, in the limit
we get a cancellation:

$$\int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta + \int_{\gamma_\epsilon} \frac{f(\zeta)}{\zeta - z}\,d\zeta + \int_{\gamma'_\epsilon} \frac{f(\zeta)}{\zeta - z}\,d\zeta = 0,$$

where $\gamma_\epsilon$ and $\gamma'_\epsilon$ are small circles of radius $\epsilon$ with negative orientation
and centered at $z$ and $z_0$ respectively. Copying the argument used in the
proof of the Cauchy integral formula in Section 4 of Chapter 2, we find
that

$$\int_{\gamma_\epsilon} \frac{f(\zeta)}{\zeta - z}\,d\zeta = -2\pi i f(z).$$

For the second integral, we use the assumption that $f$ is bounded and
that since $\epsilon$ is small, $\zeta$ stays away from $z$, and therefore

$$\left| \int_{\gamma'_\epsilon} \frac{f(\zeta)}{\zeta - z}\,d\zeta \right| \le C\epsilon.$$

Letting $\epsilon$ tend to 0 proves our contention and concludes the proof of the
extension formula (4).

Surprisingly, we may deduce from Riemann's theorem a characterization
of poles in terms of the behavior of the function in a neighborhood
of a singularity.


Corollary 3.2 Suppose that $f$ has an isolated singularity at the point
$z_0$. Then $z_0$ is a pole of $f$ if and only if $|f(z)| \to \infty$ as $z \to z_0$.

Proof. If $z_0$ is a pole, then we know that $1/f$ has a zero at $z_0$, and
therefore $|f(z)| \to \infty$ as $z \to z_0$. Conversely, suppose that this condition
holds. Then $1/f$ is bounded near $z_0$, and in fact $1/|f(z)| \to 0$ as $z \to z_0$.
Therefore, $1/f$ has a removable singularity at $z_0$ and must vanish there.
This proves the converse, namely that $z_0$ is a pole.

Isolated singularities belong to one of three categories:

• Removable singularities ($f$ bounded near $z_0$)

• Pole singularities ($|f(z)| \to \infty$ as $z \to z_0$)

• Essential singularities.

By default, any singularity that is not removable or a pole is defined
to be an essential singularity. For example, the function $e^{1/z}$ discussed
at the very beginning of Section 1 has an essential singularity at
0. We already observed the wild behavior of this function near the origin.
Contrary to the controlled behavior of a holomorphic function near
a removable singularity or a pole, it is typical for a holomorphic function
to behave erratically near an essential singularity. The next theorem
clarifies this.

Theorem 3.3 (Casorati-Weierstrass) Suppose $f$ is holomorphic in
the punctured disc $D_r(z_0) - \{z_0\}$ and has an essential singularity at $z_0$.
Then, the image of $D_r(z_0) - \{z_0\}$ under $f$ is dense in the complex plane.


Proof. We argue by contradiction. Assume that the range of $f$ is not
dense, so that there exists $w \in \mathbb{C}$ and $\delta > 0$ such that

$$|f(z) - w| > \delta \quad \text{for all } z \in D_r(z_0) - \{z_0\}.$$

We may therefore define a new function on $D_r(z_0) - \{z_0\}$ by

$$g(z) = \frac{1}{f(z) - w},$$

which is holomorphic on the punctured disc and bounded by $1/\delta$. Hence
$g$ has a removable singularity at $z_0$ by Theorem 3.1. If $g(z_0) \ne 0$, then
$f(z) - w$ is holomorphic at $z_0$, which contradicts the assumption that $z_0$
is an essential singularity. In the case that $g(z_0) = 0$, then $f(z) - w$ has
a pole at $z_0$ also contradicting the nature of the singularity at $z_0$. The
proof is complete.

In fact, Picard proved a much stronger result. He showed that under
the hypothesis of the above theorem, the function $f$ takes on every complex
value infinitely many times with at most one exception. Although
we shall not give a proof of this remarkable result, a simpler version of
it will follow from our study of entire functions in a later chapter. See
Exercise 11 in Chapter 5.
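
For the specific function $e^{1/z}$ the behavior described above can be made explicit: every non-zero value $w$ is attained at points $z_k = 1/(\log w + 2\pi i k)$ that accumulate at the origin. The Python sketch below illustrates this one example only; the target value `w` and the helper name `preimages_near_zero` are arbitrary, hypothetical choices.

```python
import cmath
import math

def preimages_near_zero(w, count=5):
    """Points z_k with exp(1/z_k) = w and |z_k| -> 0, for w != 0."""
    base = cmath.log(w)   # principal logarithm of w, imaginary part in (-pi, pi]
    # k starts at 1, so base + 2*pi*i*k has positive imaginary part and is never zero
    return [1.0 / (base + 2j * math.pi * k) for k in range(1, count + 1)]

w = 3.0 - 4.0j
points = preimages_near_zero(w)
for z in points:
    print(abs(z), cmath.exp(1.0 / z))
```

The printed moduli decrease toward 0 while the values of $e^{1/z}$ stay equal to `w` up to rounding. The single omitted value here is $w = 0$, in agreement with Picard's statement.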


We now turn to functions with only isolated singularities that are
poles. A function $f$ on an open set $\Omega$ is meromorphic if there exists a
sequence of points $\{z_0, z_1, z_2, \ldots\}$ that has no limit points in $\Omega$, and such
that

(i) the function $f$ is holomorphic in $\Omega - \{z_0, z_1, z_2, \ldots\}$, and

(ii) $f$ has poles at the points $\{z_0, z_1, z_2, \ldots\}$.


It is also useful to discuss functions that are meromorphic in the extended
complex plane. If a function is holomorphic for all large values of
$z$, we can describe its behavior at infinity using the tripartite distinction
we have used to classify singularities at finite values of $z$. Thus, if $f$ is
holomorphic for all large values of $z$, we consider $F(z) = f(1/z)$, which
is now holomorphic in a deleted neighborhood of the origin. We say that
$f$ has a pole at infinity if $F$ has a pole at the origin. Similarly, we
can speak of $f$ having an essential singularity at infinity, or a removable
singularity (hence holomorphic) at infinity in terms of the
corresponding behavior of $F$ at 0. A meromorphic function in the complex
plane that is either holomorphic at infinity or has a pole at infinity
is said to be meromorphic in the extended complex plane.

At this stage we return to the principle mentioned at the beginning of 
the chapter. Here we can see it in its simplest form. 

Theorem 3.4 The meromorphic functions in the extended complex plane 
are the rational functions. 

Proof. Suppose that $f$ is meromorphic in the extended plane. Then
$f(1/z)$ has either a pole or a removable singularity at 0, and in either
case it must be holomorphic in a deleted neighborhood of the origin, so
that the function $f$ can have only finitely many poles in the plane, say
at $z_1, \ldots, z_n$. The idea is to subtract from $f$ its principal parts at all its
poles including the one at infinity. Near each pole $z_k \in \mathbb{C}$ we can write

$$f(z) = f_k(z) + g_k(z),$$

where $f_k(z)$ is the principal part of $f$ at $z_k$ and $g_k$ is holomorphic in a
(full) neighborhood of $z_k$. In particular, $f_k$ is a polynomial in $1/(z - z_k)$.
Similarly, we can write

$$f(1/z) = \tilde{f}_\infty(z) + \tilde{g}_\infty(z),$$

where $\tilde{g}_\infty$ is holomorphic in a neighborhood of the origin and $\tilde{f}_\infty$ is the
principal part of $f(1/z)$ at 0, that is, a polynomial in $1/z$. Finally, let
$f_\infty(z) = \tilde{f}_\infty(1/z)$.

We contend that the function $H = f - f_\infty - \sum_{k=1}^{n} f_k$ is entire and
bounded. Indeed, near the pole $z_k$ we subtracted the principal part of $f$
so that the function $H$ has a removable singularity there. Also, $H(1/z)$
is bounded for $z$ near 0 since we subtracted the principal part of the
pole at $\infty$. This proves our contention, and by Liouville's theorem we
conclude that $H$ is constant. From the definition of $H$, we find that $f$ is
a rational function, as was to be shown.

Note that as a consequence, a rational function is determined up to a 
multiplicative constant by prescribing the locations and multiplicities of 
its zeros and poles.

The Riemann sphere

The extended complex plane, which consists of $\mathbb{C}$ and the point at infinity,
has a convenient geometric interpretation, which we briefly discuss here.

Consider the Euclidean space $\mathbb{R}^3$ with coordinates $(X, Y, Z)$ where the
$XY$-plane is identified with $\mathbb{C}$. We denote by $\mathbb{S}$ the sphere centered at
$(0, 0, 1/2)$ and of radius $1/2$; this sphere is of unit diameter and lies on
top of the origin of the complex plane as pictured in Figure 5. Also, we
let $\mathcal{N} = (0, 0, 1)$ denote the north pole of the sphere.



Given any point $W = (X, Y, Z)$ on $\mathbb{S}$ different from the north pole, the
line joining $\mathcal{N}$ and $W$ intersects the $XY$-plane in a single point which
we denote by $w = x + iy$; $w$ is called the stereographic projection of
$W$ (see Figure 5). Conversely, given any point $w$ in $\mathbb{C}$, the line joining
$\mathcal{N}$ and $w = (x, y, 0)$ intersects the sphere at $\mathcal{N}$ and another point, which
we call $W$. This geometric construction gives a bijective correspondence
between points on the punctured sphere $\mathbb{S} - \{\mathcal{N}\}$ and the complex plane;
it is described analytically by the formulas

$$x = \frac{X}{1 - Z} \quad \text{and} \quad y = \frac{Y}{1 - Z},$$

giving $w$ in terms of $W$, and

$$X = \frac{x}{x^2 + y^2 + 1}, \quad Y = \frac{y}{x^2 + y^2 + 1}, \quad \text{and} \quad Z = \frac{x^2 + y^2}{x^2 + y^2 + 1},$$

giving $W$ in terms of $w$. Intuitively, we have wrapped the complex plane
onto the punctured sphere $\mathbb{S} - \{\mathcal{N}\}$.

As the point $w$ goes to infinity in $\mathbb{C}$ (in the sense that $|w| \to \infty$) the
corresponding point $W$ on $\mathbb{S}$ comes arbitrarily close to $\mathcal{N}$. This simple
observation makes $\mathcal{N}$ a natural candidate for the so-called “point at
infinity.” Identifying infinity with the point $\mathcal{N}$ on $\mathbb{S}$, we see that the
extended complex plane can be visualized as the full two-dimensional
sphere $\mathbb{S}$; this is the Riemann sphere. Since this construction takes
the unbounded set $\mathbb{C}$ into the compact set $\mathbb{S}$ by adding one point, the
Riemann sphere is sometimes called the one-point compactification
of $\mathbb{C}$.

An important consequence of this interpretation is the following: although
the point at infinity required special attention when considered
separately from $\mathbb{C}$, it now finds itself on equal footing with all other points
on $\mathbb{S}$. In particular, a meromorphic function on the extended complex
plane can be thought of as a map from $\mathbb{S}$ to itself, where the image of a
pole is now a tractable point on $\mathbb{S}$, namely $\mathcal{N}$. For these reasons (and
others) the Riemann sphere provides good geometrical insight into the
structure of $\mathbb{C}$ as well as the theory of meromorphic functions.
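
The stereographic projection formulas for the sphere of radius $1/2$ centered at $(0, 0, 1/2)$ translate directly into code. The following Python sketch is illustrative only; the function names `to_plane` and `to_sphere` are hypothetical, and the test point is arbitrary.

```python
def to_plane(X, Y, Z):
    """Stereographic projection of a point (X, Y, Z) != N on the sphere to w = x + iy."""
    return complex(X / (1.0 - Z), Y / (1.0 - Z))

def to_sphere(w):
    """Inverse projection of w = x + iy onto the sphere of radius 1/2 centered at (0, 0, 1/2)."""
    x, y = w.real, w.imag
    d = x * x + y * y + 1.0
    return (x / d, y / d, (x * x + y * y) / d)

w = 2.0 - 3.0j
X, Y, Z = to_sphere(w)
on_sphere_defect = X * X + Y * Y + (Z - 0.5) ** 2 - 0.25   # zero if W lies on the sphere
round_trip = to_plane(X, Y, Z)                             # should return w
far_point = to_sphere(1e8 + 0j)                            # should be very close to N = (0, 0, 1)
print(on_sphere_defect, round_trip, far_point)
```

The last line illustrates the remark that points with large $|w|$ land close to the north pole.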

4 The argument principle and applications 

We anticipate our discussion of the logarithm (in Section 6) with a few
comments. In general, the function $\log f(z)$ is “multiple-valued” because
it cannot be defined unambiguously on the set where $f(z) \ne 0$. However
it is to be defined, it must equal $\log |f(z)| + i \arg f(z)$, where $\log |f(z)|$
is the usual real-variable logarithm of the positive quantity $|f(z)|$ (and
hence is defined unambiguously), while $\arg f(z)$ is some determination
of the argument (up to an additive integral multiple of $2\pi$). Note that in
any case, the derivative of $\log f(z)$ is $f'(z)/f(z)$ which is single-valued,
and the integral

$$\int_\gamma \frac{f'(z)}{f(z)}\,dz$$

can be interpreted as the change in the argument of $f$ as $z$ traverses
the curve $\gamma$. Moreover, assuming the curve is closed, this change of
argument is determined entirely by the zeros and poles of $f$ inside $\gamma$. We
now formulate this fact as a precise theorem.

We begin with the observation that while the additivity formula

$$\log(f_1 f_2) = \log f_1 + \log f_2$$

fails in general (as we shall see below), the additivity can be restored
to the corresponding derivatives. This is confirmed by the following
observation:

$$\frac{(f_1 f_2)'}{f_1 f_2} = \frac{f_1' f_2 + f_1 f_2'}{f_1 f_2} = \frac{f_1'}{f_1} + \frac{f_2'}{f_2},$$

which generalizes to

$$\frac{\left(\prod_{k=1}^{N} f_k\right)'}{\prod_{k=1}^{N} f_k} = \sum_{k=1}^{N} \frac{f_k'}{f_k}.$$

We apply this formula as follows. If $f$ is holomorphic and has a zero
of order $n$ at $z_0$, we can write

$$f(z) = (z - z_0)^n g(z),$$

where $g$ is holomorphic and nowhere vanishing in a neighborhood of $z_0$,
and therefore

$$\frac{f'(z)}{f(z)} = \frac{n}{z - z_0} + G(z)$$

where $G(z) = g'(z)/g(z)$. The conclusion is that if $f$ has a zero of order
$n$ at $z_0$, then $f'/f$ has a simple pole with residue $n$ at $z_0$. Observe
that a similar fact also holds if $f$ has a pole of order $n$ at $z_0$, that is, if
$f(z) = (z - z_0)^{-n} h(z)$. Then

$$\frac{f'(z)}{f(z)} = \frac{-n}{z - z_0} + H(z),$$

where $H(z) = h'(z)/h(z)$.
Therefore, if $f$ is meromorphic, the function $f'/f$ will have simple poles
at the zeros and poles of $f$, and the residue is simply the order of the
zero of $f$ or the negative of the order of the pole of $f$. As a result, an
application of the residue formula gives the following theorem.


Theorem 4.1 (Argument principle) Suppose $f$ is meromorphic in
an open set containing a circle $C$ and its interior. If $f$ has no poles
and never vanishes on $C$, then

$$\frac{1}{2\pi i} \int_C \frac{f'(z)}{f(z)}\,dz = \text{(number of zeros of $f$ inside $C$) minus (number of poles of $f$ inside $C$)},$$

where the zeros and poles are counted with their multiplicities.
Corollary 4.2 The above theorem holds for toy contours.

As an application of the argument principle, we shall prove three theorems
of interest in the general theory. The first, Rouché's theorem, is
in some sense a continuity statement. It says that a holomorphic function
can be perturbed slightly without changing the number of its zeros.
Then, we prove the open mapping theorem, which states that holomorphic
functions map open sets to open sets, an important property that
again shows the special nature of holomorphic functions. Finally, the
maximum modulus principle is reminiscent of (and in fact implies) the
same property for harmonic functions: a non-constant holomorphic function
on an open set $\Omega$ cannot attain its maximum in the interior of $\Omega$.
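
Before turning to these theorems, the argument principle itself can be tried out numerically. The sketch below approximates $\frac{1}{2\pi i}\int_C f'/f$ over a circle by an equally spaced sum. The test function, with a double zero at $0.5$ and a simple pole at $-0.25$, is a hypothetical example chosen for illustration, and the derivative is supplied by hand.

```python
import cmath
import math

def zeros_minus_poles(f, df, center=0j, radius=1.0, n=4000):
    """Approximate (1/(2*pi*i)) * integral of f'/f over the circle |z - center| = radius."""
    total = 0j
    h = 2.0 * math.pi / n
    for k in range(n):
        z = center + radius * cmath.exp(1j * k * h)
        dz = 1j * (z - center) * h          # dz = i r e^{it} dt
        total += df(z) / f(z) * dz
    return total / (2j * math.pi)

def f(z):
    # hypothetical test function: double zero at 0.5, simple pole at -0.25
    return (z - 0.5) ** 2 / (z + 0.25)

def df(z):
    # derivative of f, computed by the quotient rule
    return 2 * (z - 0.5) / (z + 0.25) - (z - 0.5) ** 2 / (z + 0.25) ** 2

count = zeros_minus_poles(f, df)
print(count)
```

With two zeros (counted with multiplicity) and one pole inside the unit circle, the printed value should be close to $2 - 1 = 1$.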

Theorem 4.3 (Rouché's theorem) Suppose that $f$ and $g$ are holomorphic
in an open set containing a circle $C$ and its interior. If

$$|f(z)| > |g(z)| \quad \text{for all } z \in C,$$

then $f$ and $f + g$ have the same number of zeros inside the circle $C$.


Proof. For $t \in [0, 1]$ define

$$f_t(z) = f(z) + t g(z)$$

so that $f_0 = f$ and $f_1 = f + g$. Let $n_t$ denote the number of zeros of $f_t$
inside the circle counted with multiplicities, so that in particular, $n_t$ is
an integer. The condition $|f(z)| > |g(z)|$ for $z \in C$ clearly implies that
$f_t$ has no zeros on the circle, and the argument principle implies

$$n_t = \frac{1}{2\pi i} \int_C \frac{f_t'(z)}{f_t(z)}\,dz.$$

To prove that $n_t$ is constant, it suffices to show that it is a continuous
function of $t$. Then we could argue that if $n_t$ were not constant,
the intermediate value theorem would guarantee the existence of some
$t_0 \in [0, 1]$ with $n_{t_0}$ not integral, contradicting the fact that $n_t \in \mathbb{Z}$ for
all $t$.

To prove the continuity of $n_t$, we observe that $f_t'(z)/f_t(z)$ is jointly
continuous for $t \in [0, 1]$ and $z \in C$. This joint continuity follows from
the fact that it holds for both the numerator and denominator, and our
assumptions guarantee that $f_t(z)$ does not vanish on $C$. Hence $n_t$ is
integer-valued and continuous, and it must be constant. We conclude
that $n_0 = n_1$, which is Rouché's theorem.
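
The deformation $f_t = f + tg$ used in the proof can be followed numerically. In the sketch below, the choice $f(z) = 3z$ and $g(z) = z^5 + 1$ on the unit circle is a hypothetical example (there $|f| = 3 > 2 \ge |g|$), and the zero count is obtained from the argument principle by an equally spaced sum; all names are illustrative.

```python
import cmath
import math

def count_zeros(p, dp, n=4000):
    """Zeros of p inside the unit circle via (1/(2*pi*i)) * integral of p'/p."""
    h = 2.0 * math.pi / n
    total = 0j
    for k in range(n):
        z = cmath.exp(1j * k * h)
        total += dp(z) / p(z) * (1j * z * h)
    return round((total / (2j * math.pi)).real)

def n_t(t):
    # f_t = f + t*g with f(z) = 3z and g(z) = z**5 + 1 (assumed example)
    return count_zeros(lambda z: 3 * z + t * (z ** 5 + 1),
                       lambda z: 3 + 5 * t * z ** 4)

counts = [n_t(k / 4) for k in range(5)]   # t = 0, 0.25, 0.5, 0.75, 1
print(counts)
```

The count should stay equal to 1 along the whole family, so $z^5 + 3z + 1$ has exactly one zero in the unit disc, just as $3z$ does.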

We now come to an important geometric property of holomorphic functions
that arises when we consider them as mappings (that is, mapping
regions in the complex plane to the complex plane).

A mapping is said to be open if it maps open sets to open sets.

Theorem 4.4 (Open mapping theorem) If $f$ is holomorphic and non-constant
in a region $\Omega$, then $f$ is open.

Proof. Let $w_0$ belong to the image of $f$, say $w_0 = f(z_0)$. We must
prove that all points $w$ near $w_0$ also belong to the image of $f$.

Define $g(z) = f(z) - w$ and write

$$g(z) = (f(z) - w_0) + (w_0 - w) = F(z) + G(z).$$

Now choose $\delta > 0$ such that the disc $|z - z_0| \le \delta$ is contained in $\Omega$ and
$f(z) \ne w_0$ on the circle $|z - z_0| = \delta$. We then select $\epsilon > 0$ so that we
have $|f(z) - w_0| \ge \epsilon$ on the circle $|z - z_0| = \delta$. Now if $|w - w_0| < \epsilon$ we
have $|F(z)| > |G(z)|$ on the circle $|z - z_0| = \delta$, and by Rouché's theorem
we conclude that $g = F + G$ has a zero inside the circle since $F$ has one.

The next result pertains to the size of a holomorphic function. We
shall refer to the maximum of a holomorphic function $f$ in an open set
$\Omega$ as the maximum of its absolute value $|f|$ in $\Omega$.

Theorem 4.5 (Maximum modulus principle) If $f$ is a non-constant
holomorphic function in a region $\Omega$, then $f$ cannot attain a maximum in
$\Omega$.

Proof. Suppose that $f$ did attain a maximum at $z_0$. Since $f$ is
holomorphic it is an open mapping, and therefore, if $D \subset \Omega$ is a small disc
centered at $z_0$, its image $f(D)$ is open and contains $f(z_0)$. This proves
that there are points $z \in D$ such that $|f(z)| > |f(z_0)|$, a contradiction.


Corollary 4.6 Suppose that $\Omega$ is a region with compact closure $\overline{\Omega}$. If $f$
is holomorphic on $\Omega$ and continuous on $\overline{\Omega}$ then

$$\sup_{z \in \Omega} |f(z)| \le \sup_{z \in \overline{\Omega} - \Omega} |f(z)|.$$

In fact, since $f(z)$ is continuous on the compact set $\overline{\Omega}$, then $|f(z)|$
attains its maximum in $\overline{\Omega}$; but this cannot be in $\Omega$ if $f$ is non-constant.
If $f$ is constant, the conclusion is trivial.

Remark. The hypothesis that $\overline{\Omega}$ is compact (that is, bounded) is essential
for the conclusion. We give an example related to considerations
that we will take up in Chapter 4. Let $\Omega$ be the open first quadrant,
bounded by the positive half-line $x \ge 0$ and the corresponding imaginary
line $y \ge 0$. Consider $F(z) = e^{-iz^2}$. Then $F$ is entire and clearly
continuous on $\overline{\Omega}$. Moreover $|F(z)| = 1$ on the two boundary lines $z = x$
and $z = iy$. However, $F(z)$ is unbounded in $\Omega$, since for example, we
have $F(z) = e^{r^2}$ if $z = r\sqrt{i} = re^{i\pi/4}$.
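
The counterexample in the remark is easy to evaluate directly. The following Python sketch samples $F(z) = e^{-iz^2}$ at a few arbitrarily chosen points on the two boundary rays and on the diagonal ray $z = re^{i\pi/4}$.

```python
import cmath
import math

def F(z):
    return cmath.exp(-1j * z * z)

radii = (1.0, 2.0, 3.0)
# Moduli on the boundary rays z = x and z = iy of the first quadrant.
boundary_moduli = [abs(F(x)) for x in (0.5, 2.0, 7.0)] + [abs(F(1j * y)) for y in (0.5, 2.0, 7.0)]
# Moduli on the diagonal ray z = r e^{i pi/4}.
diagonal_moduli = [abs(F(r * cmath.exp(1j * math.pi / 4))) for r in radii]
print(boundary_moduli)
print(diagonal_moduli, [math.exp(r * r) for r in radii])
```

The boundary values all have modulus 1 up to rounding, while the diagonal values track $e^{r^2}$, so no bound on the boundary controls $F$ inside the quadrant.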

5 Homotopies and simply connected domains 

The key to the general form of Cauchy's theorem, as well as the analysis of
multiple-valued functions, is to understand in what regions we can define
the primitive of a given holomorphic function. Note the relevance to the
study of the logarithm, which arises as a primitive of $1/z$. The question is
not just a local one, but is also global in nature. Its elucidation requires
the notion of homotopy, and the resulting idea of simple-connectivity.

Let $\gamma_0$ and $\gamma_1$ be two curves in an open set $\Omega$ with common end-points.
So if $\gamma_0(t)$ and $\gamma_1(t)$ are two parametrizations defined on $[a, b]$, we have

$$\gamma_0(a) = \gamma_1(a) = \alpha \quad \text{and} \quad \gamma_0(b) = \gamma_1(b) = \beta.$$

These two curves are said to be homotopic in $\Omega$ if for each $0 \le s \le 1$
there exists a curve $\gamma_s \subset \Omega$, parametrized by $\gamma_s(t)$ defined on $[a, b]$, such
that for every $s$

$$\gamma_s(a) = \alpha \quad \text{and} \quad \gamma_s(b) = \beta,$$

and for all $t \in [a, b]$

$$\gamma_s(t)|_{s=0} = \gamma_0(t) \quad \text{and} \quad \gamma_s(t)|_{s=1} = \gamma_1(t).$$

Moreover, $\gamma_s(t)$ should be jointly continuous in $s \in [0, 1]$ and $t \in [a, b]$.

Loosely speaking, two curves are homotopic if one curve can be deformed
into the other by a continuous transformation without ever leaving
$\Omega$ (Figure 6).

Theorem 5.1 If $f$ is holomorphic in $\Omega$, then

$$\int_{\gamma_0} f(z)\,dz = \int_{\gamma_1} f(z)\,dz$$

whenever the two curves $\gamma_0$ and $\gamma_1$ are homotopic in $\Omega$.

Proof. The key to the proof lies in showing that if two curves are close
to each other and have the same end-points, then the integrals over these
curves are equal. Recall that by definition, the function $F(s, t) = \gamma_s(t)$ is
continuous on $[0, 1] \times [a, b]$. In particular, since the image of $F$, which we
denote by $K$, is compact, there exists $\epsilon > 0$ such that every disc of radius
$3\epsilon$ centered at a point in the image of $F$ is completely contained in $\Omega$. If
not, for every integer $\ell \ge 1$, there exist points $z_\ell \in K$ and $w_\ell$ in the complement
of $\Omega$ such that $|z_\ell - w_\ell| < 1/\ell$. By compactness of $K$, there exists a
subsequence of $\{z_\ell\}$, say $\{z_{\ell_k}\}$, that converges to a point $z \in K \subset \Omega$. By
construction, we must have $w_{\ell_k} \to z$ as well, and since $\{w_\ell\}$ lies in the
complement of $\Omega$ which is closed, we must have $z \in \Omega^c$ as well. This is a
contradiction.

Having found an $\epsilon$ with the desired property, we may, by the uniform
continuity of $F$, select $\delta$ so that

$$\sup_{t \in [a, b]} |\gamma_{s_1}(t) - \gamma_{s_2}(t)| < \epsilon \quad \text{whenever } |s_1 - s_2| < \delta.$$

Fix $s_1$ and $s_2$ with $|s_1 - s_2| < \delta$. We then choose discs $\{D_0, \ldots, D_n\}$ of
radius $2\epsilon$, and consecutive points $\{z_0, \ldots, z_{n+1}\}$ on $\gamma_{s_1}$ and $\{w_0, \ldots, w_{n+1}\}$
on $\gamma_{s_2}$ such that the union of these discs covers both curves, and

$$z_i, z_{i+1}, w_i, w_{i+1} \in D_i.$$

The situation is illustrated in Figure 7.

Figure 7. Covering two nearby curves with discs

Also, we choose $z_0 = w_0$ as the beginning end-point of the curves and
$z_{n+1} = w_{n+1}$ as the common end-point. On each disc $D_i$, let $F_i$ denote a
primitive of $f$ (Theorem 2.1, Chapter 2). On the intersection of $D_i$ and
$D_{i+1}$, $F_i$ and $F_{i+1}$ are two primitives of the same function, so they must
differ by a constant, say $c_i$. Therefore

$$F_{i+1}(z_{i+1}) - F_i(z_{i+1}) = F_{i+1}(w_{i+1}) - F_i(w_{i+1}),$$

hence

(5) $$F_{i+1}(z_{i+1}) - F_{i+1}(w_{i+1}) = F_i(z_{i+1}) - F_i(w_{i+1}).$$

This implies

$$\int_{\gamma_{s_1}} f - \int_{\gamma_{s_2}} f = \sum_{i=0}^{n} [F_i(z_{i+1}) - F_i(z_i)] - \sum_{i=0}^{n} [F_i(w_{i+1}) - F_i(w_i)]$$

$$= \sum_{i=0}^{n} \left[ F_i(z_{i+1}) - F_i(w_{i+1}) - (F_i(z_i) - F_i(w_i)) \right]$$

$$= F_n(z_{n+1}) - F_n(w_{n+1}) - (F_0(z_0) - F_0(w_0)),$$

because of the cancellations due to (5). Since $\gamma_{s_1}$ and $\gamma_{s_2}$ have the same
beginning and end point, we have proved that

$$\int_{\gamma_{s_1}} f = \int_{\gamma_{s_2}} f.$$

By subdividing $[0, 1]$ into subintervals $[s_i, s_{i+1}]$ of length less than $\delta$, we
may go from $\gamma_0$ to $\gamma_1$ by finitely many applications of the above argument,
and the theorem is proved.

A region $\Omega$ in the complex plane is simply connected if any pair
of curves in $\Omega$ with the same end-points are homotopic.

Example 1. A disc $D$ is simply connected. In fact, if $\gamma_0(t)$ and $\gamma_1(t)$
are two curves lying in $D$, we can define $\gamma_s(t)$ by

$$\gamma_s(t) = (1 - s)\gamma_0(t) + s\gamma_1(t).$$

Note that if $0 \le s \le 1$, then for each $t$, the point $\gamma_s(t)$ is on the segment
joining $\gamma_0(t)$ and $\gamma_1(t)$, and so is in $D$. The same argument works if $D$ is
replaced by a rectangle, or more generally by any open convex set. (See
Exercise 21.)

Example 2. The slit plane $\Omega = \mathbb{C} - \{(-\infty, 0]\}$ is simply connected. For
a pair of curves $\gamma_0$ and $\gamma_1$ in $\Omega$, we write $\gamma_j(t) = r_j(t)e^{i\theta_j(t)}$ ($j = 0, 1$)
with $r_j(t)$ continuous and strictly positive, and $\theta_j(t)$ continuous with
$|\theta_j(t)| < \pi$. Then, we can define $\gamma_s(t)$ as $r_s(t)e^{i\theta_s(t)}$ where

$$r_s(t) = (1 - s)r_0(t) + s r_1(t) \quad \text{and} \quad \theta_s(t) = (1 - s)\theta_0(t) + s\theta_1(t).$$

We then have $\gamma_s(t) \in \Omega$ whenever $0 \le s \le 1$.


Example 3. With some effort one can show that the interior of a toy 
contour is simply connected. This requires that we divide the interior into 
several subregions. A general form of the argument is given in Exercise 4. 


Example 4. In contrast with the previous examples, the punctured 
plane C — {0} is not simply connected. Intuitively, consider two curves 
with the origin between them. It is impossible to continuously pass from 
one curve to the other without going over 0. A rigorous proof of this fact 
requires one further result, and will be given shortly. 

Theorem 5.2 Any holomorphic function in a simply connected domain 
has a primitive. 

Proof. Fix a point $z_0$ in $\Omega$ and define

$$F(z) = \int_\gamma f(w)\,dw,$$

where the integral is taken over any curve in $\Omega$ joining $z_0$ to $z$. This
definition is independent of the curve chosen, since $\Omega$ is simply connected,




and if $\tilde{\gamma}$ is another curve in $\Omega$ joining $z_0$ and $z$, we would have

$$\int_\gamma f(w)\,dw = \int_{\tilde{\gamma}} f(w)\,dw$$

by Theorem 5.1. Now we can write

$$F(z + h) - F(z) = \int_\eta f(w)\,dw$$

where $\eta$ is the line segment joining $z$ and $z + h$. Arguing as in the proof
of Theorem 2.1 in Chapter 2, we find that

$$\lim_{h \to 0} \frac{F(z + h) - F(z)}{h} = f(z).$$

As a result, we obtain the following version of Cauchy's theorem.

Corollary 5.3 If $f$ is holomorphic in the simply connected region $\Omega$,
then

$$\int_\gamma f(z)\,dz = 0$$

for any closed curve $\gamma$ in $\Omega$.


This is immediate from the existence of a primitive.

The fact that the punctured plane is not simply connected now follows
rigorously from the observation that the integral of $1/z$ over the unit
circle is $2\pi i$, and not 0.
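
The contrast between the slit plane and the punctured plane can be seen numerically. The sketch below integrates $1/z$ along two curves from $1$ to $i$ that lie in the slit plane, and then around the full unit circle. The midpoint rule and the particular curves are illustrative assumptions, and `path_integral` is a hypothetical helper.

```python
import cmath
import math

def path_integral(f, gamma, dgamma, a=0.0, b=1.0, n=20000):
    """Midpoint-rule approximation of the integral of f over the curve gamma(t), a <= t <= b."""
    h = (b - a) / n
    return sum(f(gamma(a + (k + 0.5) * h)) * dgamma(a + (k + 0.5) * h) for k in range(n)) * h

def inv(z):
    return 1.0 / z

# Two curves from 1 to i inside the slit plane: a straight segment and a quarter circle.
segment = path_integral(inv, lambda t: 1 + t * (1j - 1), lambda t: 1j - 1)
arc = path_integral(inv, lambda t: cmath.exp(0.5j * math.pi * t),
                    lambda t: 0.5j * math.pi * cmath.exp(0.5j * math.pi * t))

# The full unit circle, a closed curve in the punctured plane.
circle = path_integral(inv, lambda t: cmath.exp(2j * math.pi * t),
                       lambda t: 2j * math.pi * cmath.exp(2j * math.pi * t))
print(segment, arc, circle)
```

The first two values agree (both approximate $i\pi/2$), as Theorem 5.1 predicts for homotopic curves, while the closed-curve integral is close to $2\pi i$ rather than 0.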


6 The complex logarithm 

Suppose we wish to define the logarithm of a non-zero complex number.
If $z = re^{i\theta}$, and we want the logarithm to be the inverse to the
exponential, then it is natural to set

$$\log z = \log r + i\theta.$$

Here and below, we use the convention that $\log r$ denotes the standard$^1$
logarithm of the positive number $r$. The trouble with the above definition
is that $\theta$ is unique only up to an integer multiple of $2\pi$. However,
for given $z$ we can fix a choice of $\theta$, and if $z$ varies only a little, this
determines the corresponding choice of $\theta$ uniquely (assuming we require
that $\theta$ varies continuously with $z$). Thus “locally” we can give an unambiguous
definition of the logarithm, but this will not work “globally.” For
example, if $z$ starts at 1, and then winds around the origin and returns
to 1, the logarithm does not return to its original value, but rather differs
by an integer multiple of $2\pi i$, and therefore is not “single-valued.” To
make sense of the logarithm as a single-valued function, we must restrict
the set on which we define it. This is the so-called choice of a branch
or sheet of the logarithm.

$^1$By the standard logarithm, we mean the natural logarithm of positive numbers that
appears in elementary calculus.

Our discussion of simply connected domains given above leads to a 
natural global definition of a branch of the logarithm function. 

Theorem 6.1 Suppose that $\Omega$ is simply connected with $1 \in \Omega$, and $0 \notin \Omega$.
Then in $\Omega$ there is a branch of the logarithm $F(z) = \log_\Omega(z)$ so that

(i) $F$ is holomorphic in $\Omega$,

(ii) $e^{F(z)} = z$ for all $z \in \Omega$,

(iii) $F(r) = \log r$ whenever $r$ is a real number and near 1.

In other words, each branch $\log_\Omega(z)$ is an extension of the standard
logarithm defined for positive numbers.

Proof. We shall construct $F$ as a primitive of the function $1/z$. Since
$0 \notin \Omega$, the function $f(z) = 1/z$ is holomorphic in $\Omega$. We define

$$\log_\Omega(z) = F(z) = \int_\gamma f(w)\,dw,$$

where $\gamma$ is any curve in $\Omega$ connecting 1 to $z$. Since $\Omega$ is simply connected,
this definition does not depend on the path chosen. Arguing as in the
proof of Theorem 5.2, we find that $F$ is holomorphic and $F'(z) = 1/z$
for all $z \in \Omega$. This proves (i). To prove (ii), it suffices to show that
$ze^{-F(z)} = 1$. For that, we differentiate the left-hand side, obtaining

$$\frac{d}{dz}\left(ze^{-F(z)}\right) = e^{-F(z)} - zF'(z)e^{-F(z)} = (1 - zF'(z))e^{-F(z)} = 0.$$

Since $\Omega$ is connected we conclude, by Corollary 3.4 in Chapter 1, that
$ze^{-F(z)}$ is constant. Evaluating this expression at $z = 1$, and noting that
$F(1) = 0$, we find that this constant must be 1.

Finally, if $r$ is real and close to 1 we can choose as a path from 1 to $r$
a line segment on the real axis so that

$$F(r) = \int_1^r \frac{dx}{x} = \log r,$$

by the usual formula for the standard logarithm. This completes the
proof of the theorem.

For example, in the slit plane $\Omega = \mathbb{C} - \{(-\infty, 0]\}$ we have the principal
branch of the logarithm

$$\log z = \log r + i\theta$$

where $z = re^{i\theta}$ with $|\theta| < \pi$. (Here we drop the subscript $\Omega$, and write
simply $\log z$.) To prove this, we use the path of integration $\gamma$ shown in
Figure 8.

Figure 8. Path of integration for the principal branch of the logarithm

If $z = re^{i\theta}$ with $|\theta| < \pi$, the path consists of the line segment from 1
to $r$ and the arc $\eta$ from $r$ to $z$. Then

$$\log z = \int_1^r \frac{dx}{x} + \int_\eta \frac{dw}{w} = \log r + \int_0^\theta \frac{ire^{it}}{re^{it}}\,dt = \log r + i\theta.$$

An important observation is that in general

$$\log(z_1 z_2) \ne \log z_1 + \log z_2.$$

For example, if $z_1 = e^{2\pi i/3} = z_2$, then for the principal branch of the
logarithm, we have

$$\log z_1 = \log z_2 = \frac{2\pi i}{3},$$

and since $z_1 z_2 = e^{-2\pi i/3}$ we have

$$-\frac{2\pi i}{3} = \log(z_1 z_2) \ne \log z_1 + \log z_2.$$
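
This failure of additivity can be observed with Python's `cmath.log`, which returns the logarithm with imaginary part in $[-\pi, \pi]$ and so agrees with the principal branch on the slit plane. The snippet below is a small illustration of the example just given.

```python
import cmath
import math

z1 = z2 = cmath.exp(2j * math.pi / 3)

lhs = cmath.log(z1 * z2)                 # principal log of the product
rhs = cmath.log(z1) + cmath.log(z2)      # sum of the principal logs
defect = (rhs - lhs) / (2j * math.pi)    # the two sides differ by an integer multiple of 2*pi*i
print(lhs, rhs, defect)
```

Here the two sides differ by exactly one multiple of $2\pi i$: the sum of the arguments, $4\pi/3$, leaves the interval $(-\pi, \pi)$ and is brought back to $-2\pi/3$.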


Finally, for the principal branch of the logarithm the following Taylor
expansion holds:

(6) $$\log(1 + z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \cdots = \sum_{n=1}^{\infty} (-1)^{n+1} \frac{z^n}{n} \quad \text{for } |z| < 1.$$

Indeed, the derivative of both sides equals $1/(1 + z)$, so that they differ
by a constant. Since both sides are equal to 0 at $z = 0$ this constant
must be 0, and we have proved the desired Taylor expansion.
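
A partial sum of the series (6) can be compared with the principal branch as computed by `cmath.log`. The sample point and the number of terms below are arbitrary choices, and `log1p_series` is a hypothetical helper name.

```python
import cmath

def log1p_series(z, terms=200):
    """Partial sum of sum_{n>=1} (-1)**(n+1) * z**n / n, meaningful for |z| < 1."""
    total = 0j
    power = 1 + 0j
    for n in range(1, terms + 1):
        power *= z                                  # power == z**n
        total += (-1) ** (n + 1) * power / n
    return total

z = 0.3 + 0.4j            # |z| = 0.5 < 1
approx = log1p_series(z)
exact = cmath.log(1 + z)  # principal branch
print(approx, exact)
```

Since $|z| = 0.5$, the tail after 200 terms is far below machine precision, so the two printed values should coincide to rounding error.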

Having defined a logarithm on a simply connected domain, we can
now define the powers $z^\alpha$ for any $\alpha \in \mathbb{C}$. If $\Omega$ is simply connected with
$1 \in \Omega$ and $0 \notin \Omega$, we choose the branch of the logarithm with $\log 1 = 0$
as above, and define

$$z^\alpha = e^{\alpha \log z}.$$

Note that $1^\alpha = 1$, and that if $\alpha = 1/n$, then

$$(z^{1/n})^n = \prod_{k=1}^{n} e^{\frac{1}{n} \log z} = e^{\sum_{k=1}^{n} \frac{1}{n} \log z} = e^{\frac{n}{n} \log z} = e^{\log z} = z.$$

We know that every non-zero complex number $w$ can be written as
$w = e^{z}$. A generalization of this fact is given in the next theorem, which
discusses the existence of $\log f(z)$ whenever $f$ does not vanish.

Theorem 6.2 If f is a nowhere vanishing holomorphic function in a 
simply connected region fi ，then there exists a holomorphic function g on 
f] such that 

f(z) 二 〆 气 

The function g(z) in the theorem can be denoted by log f(z)^ and deter¬ 
mines a “branch” of that logarithm. 

Proof. Fix a point $z_0$ in $\Omega$, and define a function

$$g(z) = \int_\gamma \frac{f'(w)}{f(w)}\,dw + c_0,$$

where $\gamma$ is any path in $\Omega$ connecting $z_0$ to $z$, and $c_0$ is a complex number
so that $e^{c_0} = f(z_0)$. This definition is independent of the path $\gamma$ since
$\Omega$ is simply connected. Arguing as in the proof of Theorem 2.1, Chapter 2,
we find that $g$ is holomorphic with

$$g'(z) = \frac{f'(z)}{f(z)},$$

and a simple calculation gives

$$\frac{d}{dz}\left( f(z)e^{-g(z)} \right) = 0,$$

so that $f(z)e^{-g(z)}$ is constant. Evaluating this expression at $z_0$ we find
$f(z_0)e^{-c_0} = 1$, so that $f(z) = e^{g(z)}$ for all $z \in \Omega$, and the proof is
complete.


7 Fourier series and harmonic functions 

In Chapter 4 we shall describe some interesting connections between
complex function theory and Fourier analysis on the real line. The motivation
for this study comes in part from the simple and direct relation between
Fourier series on the circle and power series expansions of holomorphic
functions in the disc, which we now investigate.

Suppose that $f$ is holomorphic in a disc $D_R(z_0)$, so that $f$ has a power
series expansion

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$$

that converges in that disc.


Theorem 7.1 The coefficients of the power series expansion of $f$ are
given by

$$a_n = \frac{1}{2\pi r^n} \int_0^{2\pi} f(z_0 + re^{i\theta})e^{-in\theta}\,d\theta$$

for all $n \ge 0$ and $0 < r < R$. Moreover,

$$0 = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + re^{i\theta})e^{-in\theta}\,d\theta$$

whenever $n < 0$.

Proof. Since $f^{(n)}(z_0) = a_n n!$, the Cauchy integral formula gives

$$a_n = \frac{1}{2\pi i} \int_\gamma \frac{f(\zeta)}{(\zeta - z_0)^{n+1}}\,d\zeta,$$

where $\gamma$ is a circle of radius $0 < r < R$ centered at $z_0$ and with the positive
orientation. Choosing $\zeta = z_0 + re^{i\theta}$ for the parametrization of this circle,
we find that for $n \ge 0$

$$a_n = \frac{1}{2\pi i} \int_0^{2\pi} \frac{f(z_0 + re^{i\theta})}{(z_0 + re^{i\theta} - z_0)^{n+1}}\, rie^{i\theta}\,d\theta
     = \frac{1}{2\pi r^n} \int_0^{2\pi} f(z_0 + re^{i\theta})e^{-i(n+1)\theta}e^{i\theta}\,d\theta
     = \frac{1}{2\pi r^n} \int_0^{2\pi} f(z_0 + re^{i\theta})e^{-in\theta}\,d\theta.$$

Finally, even when $n < 0$, our calculation shows that we still have the
identity

$$\frac{1}{2\pi r^n} \int_0^{2\pi} f(z_0 + re^{i\theta})e^{-in\theta}\,d\theta
  = \frac{1}{2\pi i} \int_\gamma \frac{f(\zeta)}{(\zeta - z_0)^{n+1}}\,d\zeta.$$

Since $-n > 0$, the function $f(\zeta)(\zeta - z_0)^{-n-1}$ is holomorphic in the disc,
and by Cauchy's theorem the last integral vanishes.

The interpretation of this theorem is as follows. Consider $f(z_0 + re^{i\theta})$
as the restriction to the circle of a holomorphic function on the closure
of a disc centered at $z_0$ with radius $r$. Then its Fourier coefficients
vanish if $n < 0$, while those for $n \ge 0$ are equal (up to a factor of $r^n$)
to coefficients of the power series of the holomorphic function $f$. The
property of the vanishing of the Fourier coefficients for $n < 0$ reveals
another special characteristic of holomorphic functions (and in particular
their restrictions to any circle).
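The statement of Theorem 7.1 lends itself to a quick numerical experiment. The sketch below approximates the integral by an equally spaced sum on the circle and uses $f(z) = e^z$, for which $a_n = 1/n!$ at $z_0 = 0$; the helper `fourier_coefficient` and the sample size are assumptions of this illustration, not part of the text.

```python
import cmath
import math

def fourier_coefficient(f, z0, r, n, samples=256):
    """Approximate (1/(2 pi r^n)) * int_0^{2 pi} f(z0 + r e^{i t}) e^{-i n t} dt
    by the equally spaced (trapezoidal) rule on the circle."""
    total = 0j
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        total += f(z0 + r * cmath.exp(1j * theta)) * cmath.exp(-1j * n * theta)
    return total / (samples * r ** n)

f = cmath.exp                                        # entire; a_n = 1/n! at z0 = 0
a3 = fourier_coefficient(f, 0, 0.5, 3)               # expected near 1/3! = 1/6
a_minus2 = fourier_coefficient(f, 0, 0.5, -2)        # expected near 0
mean_value = fourier_coefficient(f, 1 + 1j, 0.5, 0)  # n = 0: average over the circle
print(a3, a_minus2, mean_value)
```

The case $n = 0$ returns the average of $f$ over the circle, which should reproduce $f(z_0)$.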

Next, since $a_0 = f(z_0)$, we obtain the following corollary.

Corollary 7.2 (Mean-value property) If $f$ is holomorphic in a disc
$D_R(z_0)$, then

$$f(z_0) = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + re^{i\theta})\,d\theta, \quad \text{for any } 0 < r < R.$$

Taking the real parts of both sides, we obtain the following
consequence.

Corollary 7.3 If $f$ is holomorphic in a disc $D_R(z_0)$, and $u = \mathrm{Re}(f)$,
then

$$u(z_0) = \frac{1}{2\pi} \int_0^{2\pi} u(z_0 + re^{i\theta})\,d\theta, \quad \text{for any } 0 < r < R.$$

Recall that $u$ is harmonic whenever $f$ is holomorphic, and in fact, the
above corollary is a property enjoyed by every harmonic function in the
disc $D_R(z_0)$. This follows from Exercise 12 in Chapter 2, which shows
that every harmonic function in a disc is the real part of a holomorphic
function in that disc.

8 Exercises 

1. Using Euler's formula

$$\sin \pi z = \frac{e^{i\pi z} - e^{-i\pi z}}{2i},$$

show that the complex zeros of $\sin \pi z$ are exactly at the integers, and that they
are each of order 1.

Calculate the residue of $1/\sin \pi z$ at $z = n \in \mathbb{Z}$.

2. Evaluate the integral 



Where are the poles of $1/(1 + z^4)$?


3. Show that 



4. Show that 



5. Use contour integration to show that 



for all $\xi$ real.

6. Show that

$$\int_{-\infty}^{\infty} \frac{dx}{(1 + x^2)^{n+1}} = \frac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{2 \cdot 4 \cdot 6 \cdots (2n)} \cdot \pi.$$


7. Prove that

$$\int_0^{2\pi} \frac{d\theta}{(a + \cos\theta)^2} = \frac{2\pi a}{(a^2 - 1)^{3/2}},$$

whenever $a > 1$.


8. Prove that

$$\int_0^{2\pi} \frac{d\theta}{a + b\cos\theta} = \frac{2\pi}{\sqrt{a^2 - b^2}}$$

if $a > |b|$ and $a, b \in \mathbb{R}$.


9. Show that

$$\int_0^1 \log(\sin \pi x)\,dx = -\log 2.$$

[Hint: Use the contour shown in Figure 9.]

Figure 9. Contour in Exercise 9


10. Show that if $a > 0$, then

$$\int_0^{\infty} \frac{\log x}{x^2 + a^2}\,dx = \frac{\pi}{2a} \log a.$$

[Hint: Use the contour in Figure 10.]


11. Show that if |a| < 1, then 



= 0.

Figure 10. Contour in Exercise 10


Then, prove that the above result remains true if we assume only that $|a| \le 1$.


12. Suppose $u$ is not an integer. Prove that

$$\sum_{n=-\infty}^{\infty} \frac{1}{(u + n)^2} = \frac{\pi^2}{(\sin \pi u)^2}$$

by integrating

$$f(z) = \frac{\pi \cot \pi z}{(u + z)^2}$$

over the circle $|z| = R_N = N + 1/2$ ($N$ integral, $N \ge |u|$), adding the residues of
$f$ inside the circle, and letting $N$ tend to infinity.

Note. Two other derivations of this identity, using Fourier series, were given in
Book I.


13. Suppose $f(z)$ is holomorphic in a punctured disc $D_r(z_0) - \{z_0\}$. Suppose also
that

$$|f(z)| \le A|z - z_0|^{-1+\epsilon}$$

for some $\epsilon > 0$, and all $z$ near $z_0$. Show that the singularity of $f$ at $z_0$ is removable.


14. Prove that all entire functions that are also injective take the form
$f(z) = az + b$ with $a, b \in \mathbb{C}$, and $a \neq 0$.

[Hint: Apply the Casorati-Weierstrass theorem to $f(1/z)$.]

15. Use the Cauchy inequalities or the maximum modulus principle to solve the
following problems:

(a) Prove that if $f$ is an entire function that satisfies

$$\sup_{|z| = R} |f(z)| \le AR^k + B$$

for all $R > 0$, and for some integer $k \ge 0$ and some constants $A, B > 0$, then
$f$ is a polynomial of degree $\le k$.

(b) Show that if $f$ is holomorphic in the unit disc, is bounded, and converges
uniformly to zero in the sector $\theta < \arg z < \varphi$ as $|z| \to 1$, then $f = 0$.

(c) Let $w_1, \ldots, w_n$ be points on the unit circle in the complex plane. Prove that
there exists a point $z$ on the unit circle such that the product of the distances
from $z$ to the points $w_j$, $1 \le j \le n$, is at least 1. Conclude that there exists
a point $w$ on the unit circle such that the product of the distances from $w$
to the points $w_j$, $1 \le j \le n$, is exactly equal to 1.

(d) Show that if the real part of an entire function $f$ is bounded, then $f$ is
constant.

16. Suppose $f$ and $g$ are holomorphic in a region containing the disc $|z| \le 1$.
Suppose that $f$ has a simple zero at $z = 0$ and vanishes nowhere else in $|z| \le 1$.
Let

$$f_\epsilon(z) = f(z) + \epsilon g(z).$$

Show that if $\epsilon$ is sufficiently small, then

(a) $f_\epsilon(z)$ has a unique zero in $|z| \le 1$, and

(b) if $z_\epsilon$ is this zero, the mapping $\epsilon \mapsto z_\epsilon$ is continuous.

17. Let $f$ be non-constant and holomorphic in an open set containing the closed
unit disc.

(a) Show that if $|f(z)| = 1$ whenever $|z| = 1$, then the image of $f$ contains the
unit disc. [Hint: One must show that $f(z) = w_0$ has a root for every $w_0 \in \mathbb{D}$.
To do this, it suffices to show that $f(z) = 0$ has a root (why?). Use the
maximum modulus principle to conclude.]

(b) If $|f(z)| \ge 1$ whenever $|z| = 1$ and there exists a point $z_0 \in \mathbb{D}$ such that
$|f(z_0)| < 1$, then the image of $f$ contains the unit disc.


18. Give another proof of the Cauchy integral formula

$$f(z) = \frac{1}{2\pi i} \int_C \frac{f(\zeta)}{\zeta - z}\,d\zeta$$

using homotopy of curves.

[Hint: Deform the circle $C$ to a small circle centered at $z$, and note that the
quotient $(f(\zeta) - f(z))/(\zeta - z)$ is bounded.]

19. Prove the maximum principle for harmonic functions, that is:

(a) If $u$ is a non-constant real-valued harmonic function in a region $\Omega$, then $u$
cannot attain a maximum (or a minimum) in $\Omega$.

(b) Suppose that $\Omega$ is a region with compact closure $\overline{\Omega}$. If $u$ is harmonic in $\Omega$
and continuous in $\overline{\Omega}$, then

$$\sup_{z \in \Omega} |u(z)| \le \sup_{z \in \overline{\Omega} - \Omega} |u(z)|.$$

[Hint: To prove the first part, assume that $u$ attains a local maximum at $z_0$. Let $f$
be holomorphic near $z_0$ with $u = \mathrm{Re}(f)$, and show that $f$ is not open. The second
part follows directly from the first.]

20. This exercise shows how the mean square convergence dominates the uniform
convergence of analytic functions. If $U$ is an open subset of $\mathbb{C}$ we use the notation

$$\|f\|_{L^2(U)} = \left( \int_U |f(z)|^2\,dx\,dy \right)^{1/2}$$

for the mean square norm, and

$$\|f\|_{L^\infty(U)} = \sup_{z \in U} |f(z)|$$

for the sup norm.

(a) If $f$ is holomorphic in a neighborhood of the disc $D_r(z_0)$, show that for any
$0 < s < r$ there exists a constant $C > 0$ (which depends on $s$ and $r$) such
that

$$\|f\|_{L^\infty(D_s(z_0))} \le C\|f\|_{L^2(D_r(z_0))}.$$

(b) Prove that if $\{f_n\}$ is a Cauchy sequence of holomorphic functions in the
mean square norm $\| \cdot \|_{L^2(U)}$, then the sequence $\{f_n\}$ converges uniformly
on every compact subset of $U$ to a holomorphic function.

[Hint: Use the mean-value property.]

21. Certain sets have geometric properties that guarantee they are simply
connected.

(a) An open set $\Omega \subset \mathbb{C}$ is convex if for any two points in $\Omega$, the straight line
segment between them is contained in $\Omega$. Prove that a convex open set is
simply connected.

(b) More generally, an open set $\Omega \subset \mathbb{C}$ is star-shaped if there exists a point
$z_0 \in \Omega$ such that for any $z \in \Omega$, the straight line segment between $z$ and $z_0$
is contained in $\Omega$. Prove that a star-shaped open set is simply connected.
Conclude that the slit plane $\mathbb{C} - \{(-\infty, 0]\}$ (and more generally any sector,
convex or not) is simply connected.

(c) What are other examples of open sets that are simply connected?


22. Show that there is no holomorphic function $f$ in the unit disc $\mathbb{D}$ that extends
continuously to $\partial\mathbb{D}$ such that $f(z) = 1/z$ for $z \in \partial\mathbb{D}$.


9 Problems 


1.* Consider a holomorphic map on the unit disc $f : \mathbb{D} \to \mathbb{C}$ which satisfies
$f(0) = 0$. By the open mapping theorem, the image $f(\mathbb{D})$ contains a small disc
centered at the origin. We then ask: does there exist $r > 0$ such that for all
$f : \mathbb{D} \to \mathbb{C}$ with $f(0) = 0$, we have $D_r(0) \subset f(\mathbb{D})$?

(a) Show that with no further restrictions on $f$, no such $r$ exists. It suffices to
find a sequence of functions $\{f_n\}$ holomorphic in $\mathbb{D}$ such that $1/n \notin f_n(\mathbb{D})$.
Compute $f_n'(0)$, and discuss.

(b) Assume in addition that $f$ also satisfies $f'(0) = 1$. Show that despite this
new assumption, there exists no $r > 0$ satisfying the desired condition.

[Hint: Try $f_\epsilon(z) = \epsilon(e^{z/\epsilon} - 1)$.]

The Koebe-Bieberbach theorem states that if in addition to $f(0) = 0$ and
$f'(0) = 1$ we also assume that $f$ is injective, then such an $r$ exists and the best
possible value is $r = 1/4$.


(c) As a first step, show that if $h(z) = \frac{1}{z} + c_0 + c_1 z + c_2 z^2 + \cdots$ is analytic and
injective for $0 < |z| < 1$, then $\sum_{n=1}^{\infty} n|c_n|^2 \le 1$.

[Hint: Calculate the area of the complement of $h(D_\rho(0) - \{0\})$ where
$0 < \rho < 1$, and let $\rho \to 1$.]

(d) If $f(z) = z + a_2 z^2 + \cdots$ satisfies the hypotheses of the theorem, show that
there exists another function $g$ satisfying the hypotheses of the theorem such
that $g^2(z) = f(z^2)$.

[Hint: $f(z)/z$ is nowhere vanishing so there exists $\psi$ such that
$\psi^2(z) = f(z)/z$ and $\psi(0) = 1$. Check that $g(z) = z\psi(z^2)$ is injective.]

(e) With the notation of the previous part, show that $|a_2| \le 2$, and that equality
holds if and only if

$$f(z) = \frac{z}{(1 - e^{i\theta}z)^2} \quad \text{for some } \theta \in \mathbb{R}.$$

[Hint: What is the power series expansion of $1/g(z)$? Use part (c).]

(f) If $h(z) = \frac{1}{z} + c_0 + c_1 z + c_2 z^2 + \cdots$ is injective on $\mathbb{D}$ and avoids the values
$z_1$ and $z_2$, show that $|z_1 - z_2| \le 4$.

[Hint: Look at the second coefficient in the power series expansion of
$1/(h(z) - z_j)$.]

(g) Complete the proof of the theorem. [Hint: If $f$ avoids $w$, then $1/f$ avoids 0
and $1/w$.]


2. Let $u$ be a harmonic function in the unit disc that is continuous on its closure.
Deduce Poisson's integral formula

$$u(z_0) = \frac{1}{2\pi} \int_0^{2\pi} \frac{1 - |z_0|^2}{|e^{i\theta} - z_0|^2}\, u(e^{i\theta})\,d\theta \quad \text{for } |z_0| < 1$$

from the special case $z_0 = 0$ (the mean value theorem). Show that if $z_0 = re^{i\varphi}$,
then

$$\frac{1 - |z_0|^2}{|e^{i\theta} - z_0|^2} = \frac{1 - r^2}{1 - 2r\cos(\theta - \varphi) + r^2} = P_r(\theta - \varphi),$$

and we recover the expression for the Poisson kernel derived in the exercises of the
previous chapter.

[Hint: Set $u_0(z) = u(T(z))$ where

$$T(z) = \frac{z_0 - z}{1 - \overline{z_0}\,z}.$$

Prove that $u_0$ is harmonic. Then apply the mean value theorem to $u_0$, and make
a change of variables in the integral.]


3. If $f(z)$ is holomorphic in the deleted neighborhood $\{0 < |z - z_0| < r\}$ and has
a pole of order $k$ at $z_0$, then we can write

$$f(z) = \frac{a_{-k}}{(z - z_0)^k} + \cdots + \frac{a_{-1}}{(z - z_0)} + g(z)$$

where $g$ is holomorphic in the disc $\{|z - z_0| < r\}$. There is a generalization of this
expansion that holds even if $z_0$ is an essential singularity. This is a special case of
the Laurent series expansion, which is valid in an even more general setting.

Let $f$ be holomorphic in a region containing the annulus $\{z : r_1 \le |z - z_0| \le r_2\}$
where $0 < r_1 < r_2$. Then,

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n$$

where the series converges absolutely in the interior of the annulus. To prove this,
it suffices to write

$$f(z) = \frac{1}{2\pi i} \int_{C_{r_2}} \frac{f(\zeta)}{\zeta - z}\,d\zeta - \frac{1}{2\pi i} \int_{C_{r_1}} \frac{f(\zeta)}{\zeta - z}\,d\zeta$$

when $r_1 < |z - z_0| < r_2$, and argue as in the proof of Theorem 4.4, Chapter 2.
Here $C_{r_1}$ and $C_{r_2}$ are the circles bounding the annulus.


4.* Suppose $\Omega$ is a bounded region. Let $L$ be a (two-way infinite) line that intersects
$\Omega$. Assume that $\Omega \cap L$ is an interval $I$. Choosing an orientation for $L$, we can define
$\Omega_l$ and $\Omega_r$ to be the subregions of $\Omega$ lying strictly to the left or right of $L$, with
$\Omega = \Omega_l \cup I \cup \Omega_r$ a disjoint union. If $\Omega_l$ and $\Omega_r$ are simply connected, then $\Omega$ is
simply connected.


5.* Let 



where $h$ is continuous and supported in $[-M, M]$.

(a) Prove that the function $g$ is holomorphic in $\mathbb{C} - [-M, M]$, and vanishes
at infinity, that is, $\lim_{|z| \to \infty} |g(z)| = 0$. Moreover, the "jump" of $g$ across
$[-M, M]$ is $h$, that is,

$$h(x) = \lim_{\epsilon \to 0,\ \epsilon > 0} g(x + i\epsilon) - g(x - i\epsilon).$$

[Hint: Express the difference $g(x + i\epsilon) - g(x - i\epsilon)$ in terms of a convolution
of $h$ with the Poisson kernel.]

(b) If $h$ satisfies a mild smoothness condition, for instance a Hölder condition
with exponent $\alpha$, that is, $|h(x) - h(y)| \le C|x - y|^\alpha$ for some $C > 0$ and all
$x, y \in [-M, M]$, then $g(x + i\epsilon)$ and $g(x - i\epsilon)$ converge uniformly to functions
$g_+(x)$ and $g_-(x)$ as $\epsilon \to 0$. Then, $g$ can be characterized as the unique
holomorphic function that satisfies:

(i) $g$ is holomorphic outside $[-M, M]$,

(ii) $g$ vanishes at infinity,

(iii) $g(x + i\epsilon)$ and $g(x - i\epsilon)$ converge uniformly as $\epsilon \to 0$ to functions $g_+(x)$
and $g_-(x)$ with

$$g_+(x) - g_-(x) = h(x).$$

[Hint: If $G$ is another function satisfying these conditions, $g - G$ is entire.]




Fourier Transform 


Raymond Edward Alan Christopher Paley, Fellow of
Trinity College, Cambridge, and International Research
Fellow at the Massachusetts Institute of Technology
and at Harvard University, was killed by an avalanche
on April 7, 1933, while skiing in the vicinity of Banff,
Alberta. Although only twenty-six years of age, he
was already recognized as the ablest of the group of
young English mathematicians who have been inspired
by the genius of G. H. Hardy and J. E. Littlewood. In
a group notable for its brilliant technique, no one had
developed this technique to a higher degree than Paley.
Nevertheless he should not be thought of primarily
as a technician, for with this ability he combined
creative power of the first order. As he himself was
wont to say, technique without "rugger tactics" will
not get one far, and these rugger tactics he practiced
to a degree that was characteristic of his forthright
and vigorous nature.

Possessed of an extraordinary capacity for making
friends and for scientific collaboration, Paley believed
that the inspiration of continual interchange of
ideas stimulates each collaborator to accomplish more
than he would alone. Only the exceptional man works
well with a partner, but Paley had collaborated
successfully with many, including Littlewood, Pólya,
Zygmund, and Wiener.


N. Wiener, 1933 


If $f$ is a function on $\mathbb{R}$ that satisfies appropriate regularity and decay
conditions, then its Fourier transform is defined by

$$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)e^{-2\pi i x\xi}\,dx, \quad \xi \in \mathbb{R},$$

and its counterpart, the Fourier inversion formula, holds

$$f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi, \quad x \in \mathbb{R}.$$

The Fourier transform (including its $d$-dimensional variants) plays a
basic role in analysis, as the reader of Book I is aware. Here we want to
illustrate the intimate and fruitful connection between the one-dimensional
theory of the Fourier transform and complex analysis. The main theme
(stated somewhat imprecisely) is as follows: for a function $f$ initially
defined on the real line, the possibility of extending it to a holomorphic
function is closely related to the very rapid (for example, exponential)
decay at infinity of its Fourier transform $\hat{f}$. We elaborate on this theme
in two stages.

First, we assume that $f$ can be analytically continued in a horizontal
strip containing the real axis, and has "moderate decrease" at infinity,$^1$ so
that the integral defining the Fourier transform $\hat{f}$ converges. As a result,
we conclude that $\hat{f}$ decreases exponentially at infinity; it also follows
directly that the Fourier inversion formula holds. Moreover one can
easily obtain from these considerations the Poisson summation formula
$\sum_{n \in \mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} \hat{f}(n)$. Incidentally, all these theorems are elegant
consequences of contour integration.

At a second stage, we take as our starting point the validity of the
Fourier inversion formula, which holds if we assume that both $f$ and $\hat{f}$ are
of moderate decrease, without making any assumptions on the analyticity
of $f$. We then ask a simple but natural question: What are the conditions
on $f$ so that its Fourier transform is supported in a bounded interval,
say $[-M, M]$? This is a basic problem that, as one notices, can be stated
without any reference to notions of complex analysis. However, it can
be resolved only in terms of the holomorphic properties of the function
$f$. The condition, given by the Paley-Wiener theorem, is that there be
a holomorphic extension of $f$ to $\mathbb{C}$ that satisfies the growth condition

$$|f(z)| \le Ae^{2\pi M|z|} \quad \text{for some constant } A > 0.$$

Functions satisfying this condition are said to be of exponential type.

Observe that the condition that $\hat{f}$ vanishes outside a compact set can
be viewed as an extreme version of a decay property at infinity, and so
the above theorem clearly falls within the context of the theme indicated
above.

In all these matters a decisive technique will consist in shifting the
contour of integration, that is the real line, within the boundaries of
a horizontal strip. This will take advantage of the special behavior of
$e^{-2\pi i z\xi}$ when $z$ has a non-zero imaginary part. Indeed, when $z$ is real this
exponential remains bounded and oscillates, while if $\mathrm{Im}(z) \neq 0$, it will
have exponential decay or exponential increase, depending on whether
the product $\xi\,\mathrm{Im}(z)$ is negative or positive.


$^1$ We say that a function $f$ is of moderate decrease if $f$ is continuous and there
exists $A > 0$ so that $|f(x)| \le A/(1 + x^2)$ for all $x \in \mathbb{R}$. A more restrictive condition is
that $f \in \mathcal{S}$, the Schwartz space of testing functions, which also implies that $\hat{f}$ belongs to
$\mathcal{S}$. See Book I for more details.


1 The class $\mathfrak{F}$

The weakest decay condition imposed on functions in our study of the
Fourier transform in Book I was that of moderate decrease. There, we
proved the Fourier inversion and Poisson summation formulas under the
hypothesis that $f$ and $\hat{f}$ satisfy

$$|f(x)| \le \frac{A}{1 + x^2} \quad \text{and} \quad |\hat{f}(\xi)| \le \frac{A'}{1 + \xi^2}$$

for some positive constants $A$, $A'$ and all $x, \xi \in \mathbb{R}$. We were led to consider
this class of functions because of various examples such as the Poisson
kernel

$$P_y(x) = \frac{1}{\pi} \frac{y}{y^2 + x^2}$$

for $y > 0$, which played a fundamental role in the solution of the Dirichlet
problem for the steady-state heat equation in the upper half-plane. There
we had $\hat{P}_y(\xi) = e^{-2\pi y|\xi|}$.

In the present context, we introduce a class of functions particularly
suited to the goal we have set: proving theorems about the Fourier
transform using complex analysis. Moreover, this class will be large enough
to contain many of the important applications we have in mind.

For each $a > 0$ we denote by $\mathfrak{F}_a$ the class of all functions $f$ that satisfy
the following two conditions:

(i) The function $f$ is holomorphic in the horizontal strip
$S_a = \{z \in \mathbb{C} : |\mathrm{Im}(z)| < a\}$.

(ii) There exists a constant $A > 0$ such that

$$|f(x + iy)| \le \frac{A}{1 + x^2} \quad \text{for all } x \in \mathbb{R} \text{ and } |y| < a.$$


In other words, $\mathfrak{F}_a$ consists of those holomorphic functions on $S_a$ that
are of moderate decay on each horizontal line $\mathrm{Im}(z) = y$, uniformly in
$-a < y < a$. For example, $f(z) = e^{-\pi z^2}$ belongs to $\mathfrak{F}_a$ for all $a$. Also,
the function

$$f(z) = \frac{1}{\pi} \frac{c}{c^2 + z^2},$$

which has simple poles at $z = \pm ci$, belongs to $\mathfrak{F}_a$ for all $0 < a < c$.

Another example is provided by $f(z) = 1/\cosh \pi z$, which belongs to
$\mathfrak{F}_a$ whenever $|a| < 1/2$. This function, as well as one of its fundamental
properties, was already discussed in Example 3, Section 2.1 of Chapter 3.

Note also that a simple application of the Cauchy integral formulas
shows that if $f \in \mathfrak{F}_a$, then for every $n$, the $n$th derivative of $f$ belongs to
$\mathfrak{F}_b$ for all $b$ with $0 < b < a$ (Exercise 2).

Finally, we denote by $\mathfrak{F}$ the class of all functions that belong to $\mathfrak{F}_a$ for
some $a$.

Remark. The condition of moderate decrease can be weakened
somewhat by replacing the order of decrease of $A/(1 + x^2)$ by $A/(1 + |x|^{1+\epsilon})$
for any $\epsilon > 0$. As the reader will observe, many of the results below
remain unchanged with this less restrictive condition.

2 Action of the Fourier transform on $\mathfrak{F}$

Here we prove three theorems, including the Fourier inversion and
Poisson summation formulas, for functions in $\mathfrak{F}$. The idea behind all three
proofs is the same: contour integration. Thus the approach used will be
different from that of the corresponding results in Book I.

Theorem 2.1 If $f$ belongs to the class $\mathfrak{F}_a$ for some $a > 0$, then
$|\hat{f}(\xi)| \le Be^{-2\pi b|\xi|}$ for any $0 \le b < a$.

Proof. Recall that $\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)e^{-2\pi i x\xi}\,dx$. The case $b = 0$ simply
says that $\hat{f}$ is bounded, which follows at once from the integral defining
$\hat{f}$, the assumption that $f$ is of moderate decrease, and the fact that the
exponential is bounded by 1.

Now suppose $0 < b < a$ and assume first that $\xi > 0$. The main step
consists of shifting the contour of integration, that is the real line, down
by $b$. More precisely, consider the contour in Figure 1 as well as the
function $g(z) = f(z)e^{-2\pi i z\xi}$.

We claim that as $R$ tends to infinity, the integrals of $g$ over the two
vertical sides converge to zero. For example, the integral over the vertical
segment on the left can be estimated by

$$\left| \int_{-R-ib}^{-R} g(z)\,dz \right|
  \le \int_0^b \left| f(-R - it)e^{-2\pi i(-R - it)\xi} \right| dt
  \le \int_0^b \frac{A}{R^2} e^{-2\pi t\xi}\,dt
  = O(1/R^2).$$

Figure 1. The contour in the proof of Theorem 2.1 when $\xi > 0$


A similar estimate for the other side proves our claim. Therefore, by
Cauchy's theorem applied to the large rectangle, we find in the limit as
$R$ tends to infinity that

(1)   $$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x - ib)e^{-2\pi i(x - ib)\xi}\,dx,$$

which leads to the estimate

$$|\hat{f}(\xi)| \le \int_{-\infty}^{\infty} \frac{A}{1 + x^2} e^{-2\pi b\xi}\,dx \le Be^{-2\pi b\xi},$$

where $B$ is a suitable constant. A similar argument for $\xi < 0$, but this
time shifting the real line up by $b$, allows us to finish the proof of the
theorem.

This result says that whenever $f \in \mathfrak{F}$, then $\hat{f}$ has rapid decay at infinity.
We remark that the further we can extend $f$ (that is, the larger $a$), then
the larger we can choose $b$, hence the better the decay. We shall return
to this circle of ideas in Section 3, where we describe those $f$ for which
$\hat{f}$ has the ultimate decay condition: compact support.
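The exponential decay in Theorem 2.1 can be observed numerically. The sketch below uses the example $f(x) = 1/\cosh \pi x$ from Section 1 (so any $b < 1/2$ is admissible) and a plain trapezoidal rule; the helper `fourier_transform`, the truncation interval, the step size, and the constant 2 used as $B$ are assumptions of this illustration.

```python
import math

def fourier_transform(f, xi, half_width=20.0, step=0.01):
    """Approximate  int f(x) e^{-2 pi i x xi} dx  over [-half_width, half_width]
    with the trapezoidal rule; intended for rapidly decaying f."""
    n = int(round(half_width / step))
    total = 0j
    for k in range(-n, n + 1):
        x = k * step
        weight = 0.5 if abs(k) == n else 1.0      # half weight at the two endpoints
        phase = 2 * math.pi * x * xi
        total += weight * f(x) * complex(math.cos(phase), -math.sin(phase))
    return total * step

def f(x):
    return 1.0 / math.cosh(math.pi * x)   # extends holomorphically to |Im z| < 1/2

b = 0.4                                    # any 0 <= b < 1/2 is allowed here
values = {xi: abs(fourier_transform(f, xi)) for xi in (0.0, 1.0, 2.0, 3.0)}
# If |f^(xi)| <= B e^{-2 pi b xi}, these rescaled values stay bounded by B.
ratios = {xi: v * math.exp(2 * math.pi * b * xi) for xi, v in values.items()}
print(values)
print(ratios)
```

For this particular $f$ the transform is known in closed form, $\hat{f}(\xi) = 1/\cosh \pi\xi$, which gives an independent check of the computed values.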

Since $\hat{f}$ decreases rapidly on $\mathbb{R}$, the integral in the Fourier inversion
formula makes sense, and we now turn to the complex analytic proof of
this identity.

Theorem 2.2 If $f \in \mathfrak{F}$, then the Fourier inversion holds, namely

$$f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi \quad \text{for all } x \in \mathbb{R}.$$

Besides contour integration, the proof of the theorem requires a simple
identity, which we isolate.

Lemma 2.3 If $A$ is positive and $B$ is real, then
$\int_0^{\infty} e^{-(A+iB)\xi}\,d\xi = \frac{1}{A+iB}$.

Proof. Since $A > 0$ and $B \in \mathbb{R}$, we have $|e^{-(A+iB)\xi}| = e^{-A\xi}$, and the
integral converges. By definition

$$\int_0^{\infty} e^{-(A+iB)\xi}\,d\xi = \lim_{R \to \infty} \int_0^R e^{-(A+iB)\xi}\,d\xi.$$

However,

$$\int_0^R e^{-(A+iB)\xi}\,d\xi = \left[ -\frac{e^{-(A+iB)\xi}}{A+iB} \right]_0^R,$$

which tends to $1/(A + iB)$ as $R$ tends to infinity.


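We can now prove the inversion theorem. Once again, the sign of $\xi$
matters, so we begin by writing

$$\int_{-\infty}^{\infty} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi
  = \int_{-\infty}^{0} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi + \int_0^{\infty} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi.$$

For the second integral we argue as follows. Say $f \in \mathfrak{F}_a$ and choose
$0 < b < a$. Arguing as in the proof of Theorem 2.1, or simply using
equation (1), we get

$$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(u - ib)e^{-2\pi i(u - ib)\xi}\,du,$$

so that with an application of the lemma and the convergence of the
integration in $\xi$, we find

$$\int_0^{\infty} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi
  = \int_0^{\infty} \int_{-\infty}^{\infty} f(u - ib)e^{-2\pi i(u - ib)\xi}e^{2\pi i x\xi}\,du\,d\xi$$
$$= \int_{-\infty}^{\infty} f(u - ib) \int_0^{\infty} e^{-2\pi i(u - ib - x)\xi}\,d\xi\,du
  = \int_{-\infty}^{\infty} f(u - ib) \frac{1}{2\pi b + 2\pi i(u - x)}\,du$$
$$= \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{f(u - ib)}{u - ib - x}\,du
  = \frac{1}{2\pi i} \int_{L_1} \frac{f(\zeta)}{\zeta - x}\,d\zeta,$$

where $L_1$ denotes the line $\{u - ib : u \in \mathbb{R}\}$ traversed from left to right.
(In other words, $L_1$ is the real line shifted down by $b$.) For the integral
when $\xi < 0$, a similar calculation gives

$$\int_{-\infty}^{0} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi = -\frac{1}{2\pi i} \int_{L_2} \frac{f(\zeta)}{\zeta - x}\,d\zeta,$$

where $L_2$ is the real line shifted up by $b$, with orientation from left to
right. Now given $x \in \mathbb{R}$, consider the contour $\gamma_R$ in Figure 2.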


Figure 2. The contour $\gamma_R$ in the proof of Theorem 2.2


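The function $f(\zeta)/(\zeta - x)$ has a simple pole at $x$ with residue $f(x)$, so
the residue formula gives

$$f(x) = \frac{1}{2\pi i} \int_{\gamma_R} \frac{f(\zeta)}{\zeta - x}\,d\zeta.$$

Letting $R$ tend to infinity, one checks easily that the integral over the
vertical sides goes to 0 and therefore, combining with the previous results,
we get

$$f(x) = \frac{1}{2\pi i} \int_{L_1} \frac{f(\zeta)}{\zeta - x}\,d\zeta - \frac{1}{2\pi i} \int_{L_2} \frac{f(\zeta)}{\zeta - x}\,d\zeta$$
$$= \int_0^{\infty} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi + \int_{-\infty}^{0} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi
  = \int_{-\infty}^{\infty} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi,$$

and the theorem is proved.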

The last of our three theorems is the Poisson summation formula.

Theorem 2.4 If $f \in \mathfrak{F}$, then

$$\sum_{n \in \mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} \hat{f}(n).$$

Proof. Say $f \in \mathfrak{F}_a$ and choose some $b$ satisfying $0 < b < a$. The
function $1/(e^{2\pi i z} - 1)$ has simple poles with residue $1/(2\pi i)$ at the integers.
Thus $f(z)/(e^{2\pi i z} - 1)$ has simple poles at the integers $n$, with residues
$f(n)/2\pi i$. We may therefore apply the residue formula to the contour
$\gamma_N$ in Figure 3 where $N$ is an integer.


Figure 3. The contour $\gamma_N$ in the proof of Theorem 2.4


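This yields

$$\sum_{|n| \le N} f(n) = \int_{\gamma_N} \frac{f(z)}{e^{2\pi i z} - 1}\,dz.$$

Letting $N$ tend to infinity, and recalling that $f$ has moderate decrease,
we see that the sum converges to $\sum_{n \in \mathbb{Z}} f(n)$, and also that the integral
over the vertical segments goes to 0. Therefore, in the limit we get

(2)   $$\sum_{n \in \mathbb{Z}} f(n) = \int_{L_1} \frac{f(z)}{e^{2\pi i z} - 1}\,dz - \int_{L_2} \frac{f(z)}{e^{2\pi i z} - 1}\,dz$$

where $L_1$ and $L_2$ are the real line shifted down and up by $b$, respectively.
Now we use the fact that if $|w| > 1$, then

$$\frac{1}{w - 1} = w^{-1} \sum_{n=0}^{\infty} w^{-n}$$

to see that on $L_1$ (where $|e^{2\pi i z}| > 1$) we have

$$\frac{1}{e^{2\pi i z} - 1} = e^{-2\pi i z} \sum_{n=0}^{\infty} e^{-2\pi i n z}.$$

Also if $|w| < 1$, then

$$\frac{1}{w - 1} = -\sum_{n=0}^{\infty} w^n$$

so that on $L_2$

$$\frac{1}{e^{2\pi i z} - 1} = -\sum_{n=0}^{\infty} e^{2\pi i n z}.$$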

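Substituting these observations in (2) we find that

$$\sum_{n \in \mathbb{Z}} f(n)
  = \int_{L_1} f(z) \left( e^{-2\pi i z} \sum_{n=0}^{\infty} e^{-2\pi i n z} \right) dz
  + \int_{L_2} f(z) \left( \sum_{n=0}^{\infty} e^{2\pi i n z} \right) dz$$
$$= \sum_{n=0}^{\infty} \int_{L_1} f(z)e^{-2\pi i(n+1)z}\,dz + \sum_{n=0}^{\infty} \int_{L_2} f(z)e^{2\pi i n z}\,dz$$
$$= \sum_{n=0}^{\infty} \int_{-\infty}^{\infty} f(x)e^{-2\pi i(n+1)x}\,dx + \sum_{n=0}^{\infty} \int_{-\infty}^{\infty} f(x)e^{2\pi i n x}\,dx$$
$$= \sum_{n=0}^{\infty} \hat{f}(n + 1) + \sum_{n=0}^{\infty} \hat{f}(-n)
  = \sum_{n \in \mathbb{Z}} \hat{f}(n),$$

where we have shifted $L_1$ and $L_2$ back to the real line according to
equation (1) and its analogue for the shift up.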

The Poisson summation formula has many far-reaching consequences,
and we close this section by deriving several interesting identities that
are of importance for later applications.

First, we recall the calculation in Example 1, Chapter 2, which showed
that the function $e^{-\pi x^2}$ was its own Fourier transform:

$$\int_{-\infty}^{\infty} e^{-\pi x^2}e^{-2\pi i x\xi}\,dx = e^{-\pi \xi^2}.$$

For fixed values of $t > 0$ and $a \in \mathbb{R}$, the change of variables
$x \mapsto t^{1/2}(x + a)$ in the above integral shows that the Fourier transform
of the function $f(x) = e^{-\pi t(x + a)^2}$ is $\hat{f}(\xi) = t^{-1/2}e^{-\pi \xi^2/t}e^{2\pi i a\xi}$. Applying
the Poisson summation formula to the pair $f$ and $\hat{f}$ (which belong to $\mathfrak{F}$)
provides the following relation:

(3)   $$\sum_{n=-\infty}^{\infty} e^{-\pi t(n + a)^2} = \sum_{n=-\infty}^{\infty} t^{-1/2}e^{-\pi n^2/t}e^{2\pi i n a}.$$

This identity has noteworthy consequences. For instance, the special case
$a = 0$ is the transformation law for a version of the "theta function":
if we define $\vartheta$ for $t > 0$ by the series $\vartheta(t) = \sum_{n=-\infty}^{\infty} e^{-\pi n^2 t}$, then the
relation (3) says precisely that

(4)   $$\vartheta(t) = t^{-1/2}\vartheta(1/t) \quad \text{for } t > 0.$$

This equation will be used in Chapter 6 to derive the key functional
equation of the Riemann zeta function, and this leads to its analytic
continuation. Also, the general case $a \in \mathbb{R}$ will be used in Chapter 10
to establish a corresponding law for the more general Jacobi theta
function $\Theta$.
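Relations (3) and (4) are easy to test numerically, since both series converge extremely fast. The following sketch truncates the sums at $|n| \le 50$; the function names, the truncation, and the sample values of $t$ and $a$ are choices made here for illustration.

```python
import cmath
import math

def gaussian_side(t, a, terms=50):
    """Left side of (3): sum over |n| <= terms of exp(-pi t (n + a)^2)."""
    return sum(math.exp(-math.pi * t * (n + a) ** 2)
               for n in range(-terms, terms + 1))

def dual_side(t, a, terms=50):
    """Right side of (3): sum of t^(-1/2) exp(-pi n^2 / t) exp(2 pi i n a)."""
    return sum(t ** -0.5 * math.exp(-math.pi * n * n / t)
               * cmath.exp(2j * math.pi * n * a)
               for n in range(-terms, terms + 1))

def theta(t, terms=50):
    """theta(t) = sum over |n| <= terms of exp(-pi n^2 t)."""
    return gaussian_side(t, 0.0, terms)

t, a = 0.7, 0.3
lhs, rhs = gaussian_side(t, a), dual_side(t, a)     # the two sides of (3)
theta_gap = abs(theta(t) - t ** -0.5 * theta(1 / t))  # defect in (4)
print(lhs, rhs, theta_gap)
```

The right side of (3) is computed as a complex number, but its imaginary part cancels between $n$ and $-n$, in agreement with the left side being real.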

For another application of the Poisson summation formula we recall
that we proved in Example 3, Chapter 3, that the function $1/\cosh \pi x$
was also its own Fourier transform:

$$\int_{-\infty}^{\infty} \frac{e^{-2\pi i x\xi}}{\cosh \pi x}\,dx = \frac{1}{\cosh \pi \xi}.$$

This implies that if $t > 0$ and $a \in \mathbb{R}$, then the Fourier transform of the
function $f(x) = e^{-2\pi i a x}/\cosh(\pi x/t)$ is $\hat{f}(\xi) = t/\cosh(\pi(\xi + a)t)$, and the
Poisson summation formula yields

(5)   $$\sum_{n=-\infty}^{\infty} \frac{e^{-2\pi i a n}}{\cosh(\pi n/t)} = \sum_{n=-\infty}^{\infty} \frac{t}{\cosh(\pi(n + a)t)}.$$

This formula will be used in Chapter 10 in the context of the two-squares
theorem.
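Formula (5) can be checked in the same spirit. In the sketch below the left side is summed using only the cosine part of $e^{-2\pi i a n}$, since the sine parts cancel between $n$ and $-n$; the truncation at $|n| \le 100$ and the sample values of $t$ and $a$ are assumptions of this illustration (much larger truncations would overflow `math.cosh`).

```python
import math

def left_side(t, a, terms=100):
    """sum over |n| <= terms of e^{-2 pi i a n} / cosh(pi n / t).
    The imaginary parts cancel in pairs, so only cosines are summed."""
    return sum(math.cos(2 * math.pi * a * n) / math.cosh(math.pi * n / t)
               for n in range(-terms, terms + 1))

def right_side(t, a, terms=100):
    """sum over |n| <= terms of t / cosh(pi (n + a) t)."""
    return sum(t / math.cosh(math.pi * (n + a) * t)
               for n in range(-terms, terms + 1))

t, a = 1.3, 0.25
gap = abs(left_side(t, a) - right_side(t, a))
print(left_side(t, a), right_side(t, a), gap)
```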
3 Paley-Wiener theorem

In this section we change our point of view somewhat: we do not
suppose any analyticity of $f$, but we do assume the validity of the Fourier
inversion formula

$$f(x) = \int_{-\infty}^{\infty} \hat{f}(\xi)e^{2\pi i x\xi}\,d\xi \quad \text{if} \quad \hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)e^{-2\pi i x\xi}\,dx,$$

under the conditions $|f(x)| \le A/(1 + x^2)$ and $|\hat{f}(\xi)| \le A'/(1 + \xi^2)$. For a
proof of the inversion formula under these conditions, we refer the reader
to Chapter 5 in Book I.

We start by pointing out a partial converse to Theorem 2.1.

Theorem 3.1 Suppose $\hat{f}$ satisfies the decay condition $|\hat{f}(\xi)| \le Ae^{-2\pi a|\xi|}$
for some constants $a, A > 0$. Then $f(x)$ is the restriction to $\mathbb{R}$ of a
function $f(z)$ holomorphic in the strip $S_b = \{z \in \mathbb{C} : |\mathrm{Im}(z)| < b\}$, for
any $0 < b < a$.

Proof. Define

$$f_n(z) = \int_{-n}^{n} \hat{f}(\xi)e^{2\pi i \xi z}\,d\xi,$$

and note that $f_n$ is entire by Theorem 5.4 in Chapter 2. Observe also
that $f(z)$ may be defined for all $z$ in the strip $S_b$ by

$$f(z) = \int_{-\infty}^{\infty} \hat{f}(\xi)e^{2\pi i \xi z}\,d\xi,$$

because the integral converges absolutely by our assumption on $\hat{f}$: it is
majorized by

$$A \int_{-\infty}^{\infty} e^{-2\pi a|\xi|}e^{2\pi b|\xi|}\,d\xi,$$

which is finite if $b < a$. Moreover, for $z \in S_b$

$$|f(z) - f_n(z)| \le A \int_{|\xi| \ge n} e^{-2\pi a|\xi|}e^{2\pi b|\xi|}\,d\xi \to 0 \quad \text{as } n \to \infty,$$

and thus the sequence $\{f_n\}$ converges to $f$ uniformly in $S_b$, which, by
Theorem 5.2 in Chapter 2, proves the theorem.

We digress briefly to make the following observation.

Corollary 3.2 If $\hat{f}(\xi) = O(e^{-2\pi a|\xi|})$ for some $a > 0$, and $f$ vanishes in
a non-empty open interval, then $f = 0$.

Since by the theorem $f$ is analytic in a region containing the real line, the
corollary is a consequence of Theorem 4.8 in Chapter 2. In particular,
we recover the fact proved in Exercise 21, Chapter 5 in Book I, namely
that $f$ and $\hat{f}$ cannot both have compact support unless $f = 0$.

The Paley-Wiener theorem goes a step further than the previous
theorem, and describes the nature of those functions whose Fourier transforms
are supported in a given interval $[-M, M]$.

Theorem 3.3 Suppose f is continuous and of moderate decrease on 
M. Then, f has an extension to the complex plane that is entire with 
\f(z)\ < Ae 27rM l z l for some ^4 > 0 ， if and only if f is supported in the 
interval [—M ， M]. 

One direction is simple. Suppose $\hat f$ is supported in $[-M, M]$. Then
both $f$ and $\hat f$ have moderate decrease, and the Fourier inversion formula
applies

$$f(x) = \int_{-M}^{M} \hat f(\xi)\, e^{2\pi i \xi x}\, d\xi.$$

Since the range of integration is finite, we may replace $x$ by the complex
variable $z$ in the integral, thereby defining a complex-valued function on
$\mathbb{C}$ by

$$g(z) = \int_{-M}^{M} \hat f(\xi)\, e^{2\pi i \xi z}\, d\xi.$$

By construction $g(z) = f(z)$ if $z$ is real, and $g$ is holomorphic by Theorem 5.4
in Chapter 2. Finally, if $z = x + iy$, we have

$$|g(z)| \le \int_{-M}^{M} |\hat f(\xi)|\, e^{-2\pi \xi y}\, d\xi \le A e^{2\pi M|z|}.$$
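
As an illustration of this direction, consider the triangle function $\hat f(\xi) = \max(1 - |\xi|, 0)$, supported in $[-1, 1]$, whose inverse transform is $g(z) = (\sin \pi z / \pi z)^2$. This example is an assumption made here for illustration and does not appear in the text. The sketch checks the bound $|g(x + iy)| \le A e^{2\pi M|y|}$ with $M = 1$ and $A = \int |\hat f| = 1$, which in particular gives the bound $A e^{2\pi M|z|}$ stated in the theorem.

```python
import cmath
import math

M = 1.0  # the transform is supported in [-M, M]

def g(z):
    """g(z) = (sin(pi z) / (pi z))^2, the inverse transform of max(1 - |xi|, 0)."""
    z = complex(z)
    if z == 0:
        return 1.0 + 0j
    s = cmath.sin(math.pi * z) / (math.pi * z)
    return s * s

def paley_wiener_bound(z, A=1.0):
    """The bound A exp(2 pi M |y|) for z = x + iy, with A the integral of |f_hat|."""
    return A * math.exp(2 * math.pi * M * abs(complex(z).imag))

if __name__ == "__main__":
    for z in (0.5, 3 + 2j, -1.5 - 0.7j, 0.1j):
        print(z, abs(g(z)), paley_wiener_bound(z))
```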

The converse result requires a little more work. It starts with the
observation that if $\hat f$ were supported in $[-M, M]$, then the argument
above would give the stronger bound $|f(z)| \le A e^{2\pi M|y|}$ instead of what we
assume, that is $|f(z)| \le A e^{2\pi M|z|}$. The idea is then to try to reduce to the
better situation, where this stronger bound holds. However, this is not
quite enough because we need in addition a (moderate) decay as $x \to \infty$
(when $y \ne 0$) to deal with the convergence of certain integrals at infinity.
Thus we begin by also assuming this further property of $f$, and then we
remove the additional assumptions, one step at a time.

Step 1. We first assume that $f$ is holomorphic in the complex plane,
and satisfies the following condition regarding decay in $x$ and growth
in $y$:

$$|f(x + iy)| \le A' \frac{e^{2\pi M|y|}}{1 + x^2}. \tag{6}$$

We then prove under this stronger assumption that $\hat f(\xi) = 0$ if $|\xi| > M$.
To see this, we first suppose that $\xi > M$ and write

$$\hat f(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i \xi x}\, dx = \int_{-\infty}^{\infty} f(x - iy)\, e^{-2\pi i \xi (x - iy)}\, dx.$$

Here we have shifted the real line down by an amount $y > 0$ using the
standard argument (equation (1)). Putting absolute values gives the
bound

$$|\hat f(\xi)| \le A' \int_{-\infty}^{\infty} \frac{e^{2\pi M y - 2\pi \xi y}}{1 + x^2}\, dx \le C e^{-2\pi y(\xi - M)}.$$

Letting $y$ tend to infinity, and recalling that $\xi - M > 0$, proves that
$\hat f(\xi) = 0$. A similar argument, shifting the contour up by $y > 0$, proves
that $\hat f(\xi) = 0$ whenever $\xi < -M$.

Step 2. We relax condition (6) by assuming only that $f$ satisfies

$$|f(x + iy)| \le A e^{2\pi M|y|}. \tag{7}$$

This is still a stronger condition than in the theorem, but it is weaker
than (6). Suppose first that $\xi > M$, and for $\epsilon > 0$ consider the following
auxiliary function

$$f_\epsilon(z) = \frac{f(z)}{(1 + i\epsilon z)^2}.$$

We observe that the quantity $1/(1 + i\epsilon z)^2$ has absolute value less than
or equal to 1 in the closed lower half-plane (including the real line) and
converges to 1 as $\epsilon$ tends to 0. In particular, this shows that $\hat f_\epsilon(\xi) \to \hat f(\xi)$
as $\epsilon \to 0$ since we may write

$$|\hat f_\epsilon(\xi) - \hat f(\xi)| \le \int_{-\infty}^{\infty} |f(x)| \left| \frac{1}{(1 + i\epsilon x)^2} - 1 \right| dx,$$

and recall that $f$ has moderate decrease on $\mathbb{R}$.

But for each fixed $\epsilon$, we have

$$|f_\epsilon(x + iy)| \le A'' \frac{e^{2\pi M|y|}}{1 + x^2},$$

so by Step 1 we must have $\hat f_\epsilon(\xi) = 0$, and hence $\hat f(\xi) = 0$ after passing to
the limit as $\epsilon \to 0$. A similar argument applies if $\xi < -M$, although we
must now argue in the upper half-plane, and use the factor $1/(1 - i\epsilon z)^2$
instead.
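
The two properties of the damping factor used in Step 2 follow from writing $1 + i\epsilon z = (1 - \epsilon y) + i\epsilon x$ for $z = x + iy$: when $y \le 0$ the real part is at least 1, so the modulus is at least 1. A small numerical sketch (the helper name is hypothetical) confirms this on a few sample points:

```python
def damping_factor(z, eps):
    """The factor 1 / (1 + i eps z)^2 that multiplies f in the definition of f_eps."""
    return 1.0 / (1 + 1j * eps * z) ** 2

if __name__ == "__main__":
    # Modulus is at most 1 on the closed lower half-plane.
    for eps in (1.0, 0.1, 0.001):
        worst = max(abs(damping_factor(complex(x, y), eps))
                    for x in range(-5, 6) for y in (0.0, -0.5, -3.0))
        print(eps, worst)
    # The factor tends to 1 as eps tends to 0, for a fixed point z.
    print(abs(damping_factor(2 - 1j, 1e-6) - 1))
```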

Step 3. To conclude the proof, it suffices to show that the conditions
in the theorem imply condition (7) in Step 2. In fact, after dividing by
an appropriate constant, it suffices to show that if $|f(x)| \le 1$ for all real
$x$, and $|f(z)| \le e^{2\pi M|z|}$ for all complex $z$, then

$$|f(x + iy)| \le e^{2\pi M|y|}.$$

This will follow from an ingenious and very useful idea of Phragmén and
Lindelöf that allows one to adapt the maximum modulus principle to
various unbounded regions. The particular result we need is as follows.

Theorem 3.4 Suppose $F$ is a holomorphic function in the sector

$$S = \{z : -\pi/4 < \arg z < \pi/4\}$$

that is continuous on the closure of $S$. Assume $|F(z)| \le 1$ on the boundary
of the sector, and that there are constants $C, c > 0$ such that
$|F(z)| \le C e^{c|z|}$ for all $z$ in the sector. Then

$$|F(z)| \le 1 \quad \text{for all } z \in S.$$

In other words, if $F$ is bounded by 1 on the boundary of $S$ and has no
more than a reasonable amount of growth, then $F$ is actually bounded
everywhere by 1. That some restriction on the growth of $F$ is necessary
follows from a simple observation. Consider the function $F(z) = e^{z^2}$.
Then $F$ is bounded by 1 on the boundary of $S$, but if $x$ is real, $F(x)$ is
unbounded as $x \to \infty$. We now give the proof of Theorem 3.4.
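
The counterexample can be verified directly: on the rays $\arg z = \pm\pi/4$ the square $z^2$ is purely imaginary, so $|e^{z^2}| = 1$, while on the positive real axis $e^{x^2}$ grows without bound. A minimal numerical sketch (function name chosen here for illustration):

```python
import cmath
import math

def F(z):
    """The function exp(z^2) from the remark following the theorem."""
    return cmath.exp(z * z)

if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0, 3.0):
        upper = cmath.rect(r, math.pi / 4)   # point on the ray arg z = pi/4
        lower = cmath.rect(r, -math.pi / 4)  # point on the ray arg z = -pi/4
        print(r, abs(F(upper)), abs(F(lower)), abs(F(r)))
```
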
Proof. The idea is to subdue the “enemy” function $e^{z^2}$ and turn it
to our advantage: in brief, one modifies $e^{z^2}$ by replacing it by $e^{z^\alpha}$ with
$\alpha < 2$. For simplicity we take the case $\alpha = 3/2$.

If $\epsilon > 0$, let

$$F_\epsilon(z) = F(z)\, e^{-\epsilon z^{3/2}}.$$

Here we have chosen the principal branch of the logarithm to define $z^{3/2}$
so that if $z = re^{i\theta}$ (with $-\pi < \theta < \pi$), then $z^{3/2} = r^{3/2} e^{3i\theta/2}$. Hence $F_\epsilon$
is holomorphic in $S$ and continuous up to its boundary. Also

$$|e^{-\epsilon z^{3/2}}| = e^{-\epsilon r^{3/2} \cos(3\theta/2)},$$

and since $-\pi/4 < \theta < \pi/4$ in the sector, we get the inequalities

$$-\frac{\pi}{2} < -\frac{3\pi}{8} < \frac{3\theta}{2} < \frac{3\pi}{8} < \frac{\pi}{2},$$

and therefore $\cos(3\theta/2)$ is strictly positive in the sector. This, together
with the fact that $|F(z)| \le C e^{c|z|}$, shows that $F_\epsilon(z)$ decreases rapidly in
the closed sector as $|z| \to \infty$, and in particular $F_\epsilon$ is bounded. We claim
that in fact $|F_\epsilon(z)| \le 1$ for all $z \in \bar S$, where $\bar S$ denotes the closure of $S$.
To prove this, we define

$$M = \sup_{z \in \bar S} |F_\epsilon(z)|.$$

Assuming $F$ is not identically zero, let $\{w_j\}$ be a sequence of points
such that $|F_\epsilon(w_j)| \to M$. Since $M \ne 0$ and $F_\epsilon$ decays to 0 as $|z|$ becomes
large in the sector, $w_j$ cannot escape to infinity, and we conclude that
this sequence accumulates to a point $w \in \bar S$. By the maximum principle,
$w$ cannot be an interior point of $S$, so $w$ lies on its boundary. But on the
boundary, we have first $|F(z)| \le 1$ by assumption, and also $|e^{-\epsilon z^{3/2}}| \le 1$,
so that $M \le 1$, and the claim is proved.

Finally, we may let $\epsilon$ tend to 0 to conclude the proof of the theorem.
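
The modulus identity for the damping factor $e^{-\epsilon z^{3/2}}$ and the positivity of $\cos(3\theta/2)$ on the closed sector can be checked numerically. The sketch below relies on the fact that Python's complex power uses the principal branch; the helper names are illustrative only.

```python
import cmath
import math

def damping_modulus(z, eps):
    """|exp(-eps z^(3/2))|, with z^(3/2) computed on the principal branch."""
    return abs(cmath.exp(-eps * z ** 1.5))

def predicted_modulus(r, theta, eps):
    """exp(-eps r^(3/2) cos(3 theta / 2)), the closed form used in the proof."""
    return math.exp(-eps * r ** 1.5 * math.cos(1.5 * theta))

if __name__ == "__main__":
    eps = 0.1
    for r in (0.5, 2.0, 10.0):
        for theta in (-math.pi / 4, 0.0, math.pi / 8, math.pi / 4):
            z = cmath.rect(r, theta)
            print(r, theta, damping_modulus(z, eps), predicted_modulus(r, theta, eps))
```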

Further generalizations of the Phragmén-Lindelöf theorem are included
in Exercise 9 and Problem 3.

We must now use this result to conclude the proof of the Paley-Wiener
theorem, that is, show that if $|f(x)| \le 1$ and $|f(z)| \le e^{2\pi M|z|}$,
then $|f(z)| \le e^{2\pi M|y|}$. First, note that the sector in the Phragmén-Lindelöf
theorem can be rotated, say to the first quadrant
$Q = \{z = x + iy : x > 0,\ y > 0\}$, and the result remains the same. Then, we
consider

$$F(z) = f(z)\, e^{2\pi i M z},$$

and note that $F$ is bounded by 1 on the positive real and positive imaginary
axes. Since we also have $|F(z)| \le C e^{c|z|}$ in the quadrant, we conclude
by the Phragmén-Lindelöf theorem that $|F(z)| \le 1$ for all $z$ in $Q$,
which implies $|f(z)| \le e^{2\pi M y}$. A similar argument for the other quadrants
concludes Step 3 as well as the proof of the Paley-Wiener theorem.

We conclude with another version of the idea behind the Paley-Wiener
theorem, this time characterizing the functions whose Fourier transform
vanishes for all negative $\xi$.

Theorem 3.5 Suppose $f$ and $\hat f$ have moderate decrease. Then $\hat f(\xi) = 0$
for all $\xi < 0$ if and only if $f$ can be extended to a continuous and
bounded function in the closed upper half-plane $\{z = x + iy : y \ge 0\}$ with
$f$ holomorphic in the interior.

Proof. First assume $\hat f(\xi) = 0$ for $\xi < 0$. By the Fourier inversion
formula

$$f(x) = \int_0^{\infty} \hat f(\xi)\, e^{2\pi i x \xi}\, d\xi,$$

and we can extend $f$ when $z = x + iy$ with $y \ge 0$ by

$$f(z) = \int_0^{\infty} \hat f(\xi)\, e^{2\pi i z \xi}\, d\xi.$$

Notice that the integral converges and that

$$|f(z)| \le A \int_0^{\infty} \frac{d\xi}{1 + \xi^2} < \infty,$$

which proves the boundedness of $f$. The uniform convergence of

$$f_n(z) = \int_0^{n} \hat f(\xi)\, e^{2\pi i z \xi}\, d\xi$$

to $f(z)$ in the closed half-plane establishes the continuity of $f$ there, and
its holomorphicity in the interior.

For the converse, we argue in the spirit of the proof of Theorem 3.3.
For $\epsilon$ and $\delta$ positive, we set

$$f_{\epsilon,\delta}(z) = \frac{f(z + i\delta)}{(1 - i\epsilon z)^2}.$$

Then $f_{\epsilon,\delta}$ is holomorphic in a region containing the closed upper
half-plane. One also shows as before, using Cauchy’s theorem, that $\hat f_{\epsilon,\delta}(\xi) = 0$
for all $\xi < 0$. Then, passing to the limit successively, one has $\hat f_{\epsilon,0}(\xi) = 0$
for $\xi < 0$, and finally $\hat f(\xi) = \hat f_{0,0}(\xi) = 0$ for all $\xi < 0$.


Remark. The reader should note a certain analogy between the above 
theorem and Theorem 7.1 in Chapter 3. Here we are dealing with a 
function holomorphic in the upper half-plane, and there with a function 
holomorphic in a disc. In the present case the Fourier transform vanishes
when $\xi < 0$, and in the earlier case, the Fourier coefficients vanish when
$n < 0$.


4 Exercises 

1. Suppose $f$ is continuous and of moderate decrease, and $\hat f(\xi) = 0$ for all $\xi \in \mathbb{R}$.
Show that $f = 0$ by completing the following outline:

(a) For each fixed real number $t$ consider the two functions

$$A(z) = \int_{-\infty}^{t} f(x)\, e^{-2\pi i z(x - t)}\, dx \quad \text{and} \quad B(z) = -\int_{t}^{\infty} f(x)\, e^{-2\pi i z(x - t)}\, dx.$$

Show that $A(\xi) = B(\xi)$ for all $\xi \in \mathbb{R}$.

(b) Prove that the function $F$ equal to $A$ in the closed upper half-plane, and $B$
in the lower half-plane, is entire and bounded, thus constant. In fact, show
that $F = 0$.

(c) Deduce that

$$\int_{-\infty}^{t} f(x)\, dx = 0,$$

for all $t$, and conclude that $f = 0$.



2. If $f \in \mathfrak{F}_a$ with $a > 0$, then for any positive integer $n$ one has $f^{(n)} \in \mathfrak{F}_b$ whenever
$0 \le b < a$.

[Hint: Modify the solution to Exercise 8 in Chapter 2.] 


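3. Show, by contour integration, that if $a > 0$ and $\xi \in \mathbb{R}$ then

$$\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{a}{a^2 + x^2}\, e^{-2\pi i x \xi}\, dx = e^{-2\pi a|\xi|},$$

and check that

$$\int_{-\infty}^{\infty} e^{-2\pi a|\xi|}\, e^{2\pi i \xi x}\, d\xi = \frac{1}{\pi} \frac{a}{a^2 + x^2}.$$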
4. Suppose $Q$ is a polynomial of degree $\ge 2$ with distinct roots, none lying on the
real axis. Calculate

$$\int_{-\infty}^{\infty} \frac{e^{-2\pi i x \xi}}{Q(x)}\, dx, \quad \xi \in \mathbb{R},$$

in terms of the roots of $Q$. What happens when several roots coincide?
[Hint: Consider separately the cases $\xi < 0$, $\xi = 0$, and $\xi > 0$. Use residues.]


5. More generally, let $R(x) = P(x)/Q(x)$ be a rational function with (degree $Q$) $\ge$
(degree $P$) $+ 2$ and $Q(x) \ne 0$ on the real axis.

(a) Prove that if $\alpha_1, \ldots, \alpha_k$ are the roots of $R$ in the upper half-plane, then
there exist polynomials $P_j(\xi)$ of degree less than the multiplicity of $\alpha_j$ so
that

$$\int_{-\infty}^{\infty} R(x)\, e^{-2\pi i x \xi}\, dx = \sum_{j=1}^{k} P_j(\xi)\, e^{-2\pi i \alpha_j \xi}, \quad \text{when } \xi < 0.$$

(b) In particular, if $Q(z)$ has no zeros in the upper half-plane, then
$\int_{-\infty}^{\infty} R(x)\, e^{-2\pi i x \xi}\, dx = 0$ for $\xi < 0$.

(c) Show that similar results hold in the case $\xi > 0$.

(d) Show that

$$\int_{-\infty}^{\infty} R(x)\, e^{-2\pi i x \xi}\, dx = O(e^{-a|\xi|}), \quad \xi \in \mathbb{R},$$

as $|\xi| \to \infty$ for some $a > 0$. Determine the best possible $a$’s in terms of the
roots of $R$.

[Hint: For part (a), use residues. The powers of $\xi$ appear when one differentiates
the function $f(z) = R(z)\, e^{-2\pi i z \xi}$ (as in the formula of Theorem 1.4 in the previous
chapter). For part (c) argue in the lower half-plane.]



6. Prove that

$$\frac{1}{\pi} \sum_{n=-\infty}^{\infty} \frac{a}{a^2 + n^2} = \sum_{n=-\infty}^{\infty} e^{-2\pi a|n|}$$

whenever $a > 0$. Hence show that the sum equals $\coth \pi a$.

7. The Poisson summation formula applied to specific examples often provides 
interesting identities. 

(a) Let $\tau$ be fixed with $\mathrm{Im}(\tau) > 0$. Apply the Poisson summation formula to

$$f(z) = (\tau + z)^{-k},$$

where $k$ is an integer $\ge 2$, to obtain

$$\sum_{n=-\infty}^{\infty} \frac{1}{(\tau + n)^k} = \frac{(-2\pi i)^k}{(k-1)!} \sum_{m=1}^{\infty} m^{k-1} e^{2\pi i m \tau}.$$

(b) Set $k = 2$ in the above formula to show that if $\mathrm{Im}(\tau) > 0$, then

$$\sum_{n=-\infty}^{\infty} \frac{1}{(\tau + n)^2} = \frac{\pi^2}{\sin^2(\pi \tau)}.$$

(c) Can one conclude that the above formula holds true whenever $\tau$ is any
complex number that is not an integer?

[Hint: For (a), use residues to prove that $\hat f(\xi) = 0$, if $\xi < 0$, and
$\hat f(\xi) = \frac{(-2\pi i)^k}{(k-1)!}\, \xi^{k-1} e^{2\pi i \xi \tau}$, when $\xi > 0$.]


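8. Suppose $\hat f$ has compact support contained in $[-M, M]$ and let $f(z) = \sum_{n=0}^{\infty} a_n z^n$.
Show that

$$a_n = \frac{(2\pi i)^n}{n!} \int_{-M}^{M} \hat f(\xi)\, \xi^n\, d\xi,$$

and as a result

$$\limsup_{n \to \infty}\, (n!\, |a_n|)^{1/n} \le 2\pi M.$$

In the converse direction, let $f$ be any power series $f(z) = \sum_{n=0}^{\infty} a_n z^n$ with
$\limsup_{n \to \infty} (n!\, |a_n|)^{1/n} \le 2\pi M$. Then, $f$ is holomorphic in the complex plane,
and for every $\epsilon > 0$ there exists $A_\epsilon > 0$ such that

$$|f(z)| \le A_\epsilon\, e^{2\pi(M + \epsilon)|z|}.$$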


9. Here are further results similar to the Phragmén-Lindelöf theorem.

(a) Let $F$ be a holomorphic function in the right half-plane that extends
continuously to the boundary, that is, the imaginary axis. Suppose that $|F(iy)| \le 1$
for all $y \in \mathbb{R}$, and

$$|F(z)| \le C e^{c|z|^\gamma}$$

for some $c, C > 0$ and $\gamma < 1$. Prove that $|F(z)| \le 1$ for all $z$ in the right
half-plane.

(b) More generally, let $S$ be a sector whose vertex is the origin, and forming an
angle of $\pi/\beta$. Let $F$ be a holomorphic function in $S$ that is continuous on
the closure of $S$, so that $|F(z)| \le 1$ on the boundary of $S$ and

$$|F(z)| \le C e^{c|z|^\alpha} \quad \text{for all } z \in S$$

for some $c, C > 0$ and $0 < \alpha < \beta$. Prove that $|F(z)| \le 1$ for all $z \in S$.


10. This exercise generalizes some of the properties of $e^{-\pi x^2}$ related to the fact
that it is its own Fourier transform.

Suppose $f(z)$ is an entire function that satisfies

$$|f(x + iy)| \le c\, e^{-a x^2 + b y^2}$$

for some $a, b, c > 0$. Let

$$\hat f(\zeta) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x \zeta}\, dx.$$

Then, $\hat f$ is an entire function of $\zeta$ that satisfies

$$|\hat f(\xi + i\eta)| \le c'\, e^{-a' \xi^2 + b' \eta^2}$$

for some $a', b', c' > 0$.

[Hint: To prove $\hat f(\xi) = O(e^{-a' \xi^2})$, assume $\xi > 0$ and change the contour of
integration to $x - iy$ for some $y > 0$ fixed, and $-\infty < x < \infty$. Then

$$\hat f(\xi) = O(e^{-2\pi y \xi}\, e^{b y^2}).$$

Finally, choose $y = d\xi$ where $d$ is a small constant.]


11. One can give a neater formulation of the result in Exercise 10 by proving the 
following fact. 

Suppose $f(z)$ is an entire function of strict order 2, that is,

$$f(z) = O(e^{c_1 |z|^2})$$

for some $c_1 > 0$. Suppose also that for $x$ real,

$$f(x) = O(e^{-c_2 |x|^2})$$

for some $c_2 > 0$. Then

$$|f(x + iy)| = O(e^{-a x^2 + b y^2})$$

for some $a, b > 0$. The converse is obviously true.
12. The principle that a function and its Fourier transform cannot both be too
small at infinity is illustrated by the following theorem of Hardy.

If $f$ is a function on $\mathbb{R}$ that satisfies

$$f(x) = O(e^{-\pi x^2}) \quad \text{and} \quad \hat f(\xi) = O(e^{-\pi \xi^2}),$$

then $f$ is a constant multiple of $e^{-\pi x^2}$. As a result, if $f(x) = O(e^{-\pi A x^2})$, and
$\hat f(\xi) = O(e^{-\pi B \xi^2})$, with $AB > 1$ and $A, B > 0$, then $f$ is identically zero.

(a) If $f$ is even, show that $\hat f$ extends to an even entire function. Moreover, if
$g(z) = \hat f(z^{1/2})$, then $g$ satisfies

$$|g(x)| \le c\, e^{-\pi x} \quad \text{and} \quad |g(z)| \le c\, e^{\pi R \sin^2(\theta/2)} \le c\, e^{\pi |z|}$$

when $x \in \mathbb{R}$ and $z = Re^{i\theta}$ with $R \ge 0$ and $\theta \in \mathbb{R}$.

(b) Apply the Phragmén-Lindelöf principle to the function

$$F(z) = g(z)\, e^{\gamma z} \quad \text{where} \quad \gamma = i\pi\, \frac{e^{-i\pi/(2\beta)}}{\sin(\pi/(2\beta))}$$

and the sector $0 \le \theta \le \pi/\beta < \pi$, and let $\beta \to 1$ to deduce that $e^{\pi z} g(z)$ is
bounded in the closed upper half-plane. The same result holds in the lower
half-plane, so by Liouville’s theorem $e^{\pi z} g(z)$ is constant, as desired.

(c) If $f$ is odd, then $\hat f(0) = 0$, and apply the above argument to $\hat f(z)/z$ to deduce
that $f = \hat f = 0$. Finally, write an arbitrary $f$ as an appropriate sum of an
even function and an odd function.


5 Problems 

1. Suppose $\hat f(\xi) = O(e^{-a|\xi|^p})$ as $|\xi| \to \infty$, for some $p > 1$. Then $f$ is holomorphic
for all $z$ and satisfies the growth condition

$$|f(z)| \le A e^{a|z|^q}$$

where $1/p + 1/q = 1$.

Note that on the one hand, when $p \to \infty$ then $q \to 1$, and this limiting case
can be interpreted as part of Theorem 3.3. On the other hand, when $p \to 1$ then
$q \to \infty$, and this limiting case in a sense brings us back to Theorem 2.1.

[Hint: To prove the result, use the inequality $-\xi^p + \xi u \le u^q$, which is valid when
$\xi$ and $u$ are non-negative. To establish this inequality, examine separately the
cases $\xi^p \ge \xi u$ and $\xi^p < \xi u$; note also that the functions $\xi = u^{q-1}$ and $u = \xi^{p-1}$ are
inverses of each other because $(p - 1)(q - 1) = 1$.]
2. The problem is to solve the differential equation

$$a_n \frac{d^n}{dt^n} u(t) + a_{n-1} \frac{d^{n-1}}{dt^{n-1}} u(t) + \cdots + a_0 u(t) = f(t),$$

where $a_0, a_1, \ldots, a_n$ are complex constants, and $f$ is a given function. Here we
suppose that $f$ has bounded support and is smooth (say of class $C^2$).

(a) Let

$$\hat f(z) = \int_{-\infty}^{\infty} f(t)\, e^{-2\pi i z t}\, dt.$$

Observe that $\hat f$ is an entire function, and using integration by parts show
that

$$|\hat f(x + iy)| \le \frac{A}{1 + x^2}$$

if $|y| \le a$ for any fixed $a \ge 0$.

(b) Write

$$P(z) = a_n (2\pi i z)^n + a_{n-1} (2\pi i z)^{n-1} + \cdots + a_0.$$

Find a real number $c$ so that $P(z)$ does not vanish on the line

$$L = \{z : z = x + ic,\ x \in \mathbb{R}\}.$$

(c) Set

$$u(t) = \int_L \frac{e^{2\pi i z t}}{P(z)}\, \hat f(z)\, dz.$$

Check that

$$\sum_{j=0}^{n} a_j \left(\frac{d}{dt}\right)^j u(t) = \int_L e^{2\pi i z t}\, \hat f(z)\, dz$$

and

$$\int_L e^{2\pi i z t}\, \hat f(z)\, dz = \int_{-\infty}^{\infty} e^{2\pi i x t}\, \hat f(x)\, dx.$$

Conclude by the Fourier inversion theorem that

$$\sum_{j=0}^{n} a_j \left(\frac{d}{dt}\right)^j u(t) = f(t).$$

Note that the solution $u$ depends on the choice $c$.

3.* In this problem, we investigate the behavior of certain bounded holomorphic 
functions in an infinite strip. The particular result described here is sometimes 
called the three-lines lemma. 

(a) Suppose $F(z)$ is holomorphic and bounded in the strip $0 < \mathrm{Im}(z) < 1$ and
continuous on its closure. If $|F(z)| \le 1$ on the boundary lines, then
$|F(z)| \le 1$ throughout the strip.

(b) For the more general $F$, let $\sup_{x \in \mathbb{R}} |F(x)| = M_0$ and $\sup_{x \in \mathbb{R}} |F(x + i)| = M_1$.
Then,

$$\sup_{x \in \mathbb{R}} |F(x + iy)| \le M_0^{1-y} M_1^{y}, \quad \text{if } 0 \le y \le 1.$$

(c) As a consequence, prove that $\log \sup_{x \in \mathbb{R}} |F(x + iy)|$ is a convex function of
$y$ when $0 \le y \le 1$.

[Hint: For part (a), apply the maximum modulus principle to $F_\epsilon(z) = F(z)\, e^{-\epsilon z^2}$.
For part (b), consider $M_0^{-iz-1} M_1^{iz} F(z)$, whose modulus at $z = x + iy$ is $M_0^{y-1} M_1^{-y} |F(z)|$.]

4.* There is a relation between the Paley-Wiener theorem and an earlier
representation due to E. Borel.

(a) A function $f(z)$, holomorphic for all $z$, satisfies $|f(z)| \le A_\epsilon e^{2\pi(M + \epsilon)|z|}$ for
all $\epsilon$ if and only if it is representable in the form

$$f(z) = \int_C e^{2\pi i z w}\, g(w)\, dw$$

where $g$ is holomorphic outside the circle of radius $M$ centered at the origin,
and $g$ vanishes at infinity. Here $C$ is any circle centered at the origin of radius
larger than $M$. In fact, if $f(z) = \sum a_n z^n$, then $g(w) = \sum_{n=0}^{\infty} A_n w^{-n-1}$ with
$a_n = A_n (2\pi i)^{n+1}/n!$.

(b) The connection with Theorem 3.3 is as follows. For these functions $f$ (for
which in addition $f$ and $\hat f$ are of moderate decrease on the real axis), one can
assert that the $g$ above is holomorphic in the larger region, which consists
of the slit plane $\mathbb{C} - [-M, M]$. Moreover, the relation between $g$ and the
Fourier transform $\hat f$ is



so that $\hat f$ represents the jump of $g$ across the segment $[-M, M]$; that is,

$$\hat f(x) = \lim_{\epsilon \to 0,\ \epsilon > 0} g(x + i\epsilon) - g(x - i\epsilon).$$


See Problem 5 in Chapter 3. 
...but after the 15th of October I felt myself a free
man, with such longing for mathematical work, that
the last two months flew by quickly, and that only
today I found the letter of the 19th of October that I
had not answered. The result of my work, with which
I am not entirely satisfied, I want to share with you.

Firstly, in looking back at my lectures, a gap in
function theory needed to be filled. As you know, up
to now the following question had been unresolved.
Given an arbitrary sequence of complex numbers,
$a_1, a_2, \ldots$, can one construct an entire (transcendental)
function that vanishes at these values, with prescribed
multiplicities, and nowhere else?...

K. Weierstrass, 1874 


In this chapter, we will study functions that are holomorphic in the 
whole complex plane; these are called entire functions. Our presentation 
will be organized around the following three questions: 

1. Where can such functions vanish? We shall see that the obvious
necessary condition is also sufficient: if $\{z_n\}$ is any sequence of
complex numbers having no limit point in $\mathbb{C}$, then there exists an
entire function vanishing exactly at the points of this sequence. The
construction of the desired function is inspired by Euler’s product
formula for $\sin \pi z$ (the prototypical case when $\{z_n\}$ is $\mathbb{Z}$), but
requires an additional refinement: the Weierstrass canonical factors.

2. How do these functions grow at infinity? Here, matters are controlled
by an important principle: the larger a function is, the more
zeros it can have. This principle already manifests itself in the simple
case of polynomials. By the fundamental theorem of algebra,
the number of zeros of a polynomial $P$ of degree $d$ is precisely $d$,
which is also the exponent in the order of (polynomial) growth of
$P$, namely

$$\sup_{|z| = R} |P(z)| \approx R^d \quad \text{as } R \to \infty.$$
A precise version of this general principle is contained in Jensen’s
formula, which we prove in the first section. This formula, central 
to much of the theory developed in this chapter, exhibits a deep 
connection between the number of zeros of a function in a disc and 
the (logarithmic) average of the function over the circle. In fact, 
Jensen’s formula not only constitutes a natural starting point for 
us, but also leads to the fruitful theory of value distributions, also 
called Nevanlinna theory (which, however, we do not take up here). 


3. To what extent are these functions determined by their zeros? It 
turns out that if an entire function has a finite (exponential) order 
of growth, then it can be specified by its zeros up to multiplication 
by a simple factor. The precise version of this assertion is the 
Hadamard factorization theorem. It may be viewed as another 
instance of the general rule that was formulated in Chapter 3, that 
is, that under appropriate conditions, a holomorphic function is 
essentially determined by its zeros. 

1 Jensen’s formula 

In this section, we denote by $D_R$ and $C_R$ the open disc and circle of
radius $R$ centered at the origin. We shall also, in the rest of this chapter,
exclude the trivial case of the function that vanishes identically.

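Theorem 1.1 Let $\Omega$ be an open set that contains the closure of a disc
$D_R$ and suppose that $f$ is holomorphic in $\Omega$, $f(0) \ne 0$, and $f$ vanishes
nowhere on the circle $C_R$. If $z_1, \ldots, z_N$ denote the zeros of $f$ inside the
disc (counted with multiplicities),^1 then

$$\log|f(0)| = \sum_{k=1}^{N} \log\left(\frac{|z_k|}{R}\right) + \frac{1}{2\pi} \int_0^{2\pi} \log|f(Re^{i\theta})|\, d\theta. \tag{1}$$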

The proof of the theorem consists of several steps. 

Step 1. First, we observe that if $f_1$ and $f_2$ are two functions satisfying
the hypotheses and the conclusion of the theorem, then the product
$f_1 f_2$ also satisfies the hypothesis of the theorem and formula (1). This
observation is a simple consequence of the fact that $\log xy = \log x + \log y$
whenever $x$ and $y$ are positive numbers, and that the set of zeros of $f_1 f_2$
is the union of the sets of zeros of $f_1$ and $f_2$.


1 That is, each zero appears in the sequence as many times as its order. 


Step 2. The function

$$g(z) = \frac{f(z)}{(z - z_1) \cdots (z - z_N)},$$

initially defined on $\Omega - \{z_1, \ldots, z_N\}$, is bounded near each $z_j$. Therefore
each $z_j$ is a removable singularity, and hence we can write

$$f(z) = (z - z_1) \cdots (z - z_N)\, g(z)$$

where $g$ is holomorphic in $\Omega$ and nowhere vanishing in the closure of $D_R$.
By Step 1, it suffices to prove Jensen’s formula for functions like $g$ that
vanish nowhere, and for functions of the form $z - z_j$.

Step 3. We first prove (1) for a function $g$ that vanishes nowhere in the
closure of $D_R$. More precisely, we must establish the following identity:

$$\log|g(0)| = \frac{1}{2\pi} \int_0^{2\pi} \log|g(Re^{i\theta})|\, d\theta.$$

In a slightly larger disc, we can write $g(z) = e^{h(z)}$ where $h$ is holomorphic
in that disc. This is possible since discs are simply connected, and we
can define $h = \log g$ (see Theorem 6.2 in Chapter 3). Now we observe
that

$$|g(z)| = |e^{h(z)}| = |e^{\mathrm{Re}(h(z)) + i\,\mathrm{Im}(h(z))}| = e^{\mathrm{Re}(h(z))},$$

so that $\log|g(z)| = \mathrm{Re}(h(z))$. The mean value property (Corollary 7.3 in
Chapter 3) then immediately implies the desired formula for $g$.

Step 4. The last step is to prove the formula for functions of the form
$f(z) = z - w$, where $w \in D_R$. That is, we must show that

$$\log|w| = \log\left(\frac{|w|}{R}\right) + \frac{1}{2\pi} \int_0^{2\pi} \log|Re^{i\theta} - w|\, d\theta.$$

Since $\log(|w|/R) = \log|w| - \log R$ and $\log|Re^{i\theta} - w| = \log R + \log|e^{i\theta} - w/R|$,
it suffices to prove that

$$\int_0^{2\pi} \log|e^{i\theta} - a|\, d\theta = 0, \quad \text{whenever } |a| < 1.$$

This in turn is equivalent (after the change of variables $\theta \mapsto -\theta$) to

$$\int_0^{2\pi} \log|1 - a e^{i\theta}|\, d\theta = 0, \quad \text{whenever } |a| < 1.$$
To prove this, we use the function $F(z) = 1 - az$, which vanishes nowhere
in the closure of the unit disc. As a consequence, there exists a holomorphic
function $G$ in a disc of radius greater than 1 such that $F(z) = e^{G(z)}$.
Then $|F| = e^{\mathrm{Re}(G)}$, and therefore $\log|F| = \mathrm{Re}(G)$. Since $F(0) = 1$
we have $\log|F(0)| = 0$, and an application of the mean value property
(Corollary 7.3 in Chapter 3) to the harmonic function $\log|F(z)|$ concludes
the proof of the theorem.
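
Jensen’s formula lends itself to a direct numerical check. The sketch below uses a hypothetical test polynomial with two zeros inside the unit disc and one outside, and approximates the circle average by the trapezoid rule, which converges very quickly here because the integrand is smooth and periodic. All names are illustrative.

```python
import cmath
import math

def circle_log_mean(f, R, samples=4096):
    """Approximate (1 / 2 pi) * integral over [0, 2 pi] of log|f(R e^{i theta})|."""
    total = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        total += math.log(abs(f(R * cmath.exp(1j * theta))))
    return total / samples

def jensen_rhs(f, zeros_inside, R):
    """Sum of log(|z_k| / R) over the zeros inside the disc, plus the circle average."""
    return sum(math.log(abs(z) / R) for z in zeros_inside) + circle_log_mean(f, R)

def example(z):
    # Hypothetical test function: zeros 0.5 and -0.3i lie inside |z| < 1, zero 2 outside.
    return (z - 0.5) * (z + 0.3j) * (z - 2)

if __name__ == "__main__":
    lhs = math.log(abs(example(0)))
    rhs = jensen_rhs(example, [0.5, -0.3j], 1.0)
    print(lhs, rhs)
```

Both sides equal $\log 0.3$ for this choice: the two interior factors contribute zero to the circle average, and the exterior factor $z - 2$ contributes $\log 2$.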

From Jensen’s formula we can derive an identity linking the growth of
a holomorphic function with its number of zeros inside a disc. If $f$ is a
holomorphic function on the closure of a disc $D_R$, we denote by $n(r)$ (or
$n_f(r)$ when it is necessary to keep track of the function in question) the
number of zeros of $f$ (counted with their multiplicities) inside the disc
$D_r$, with $0 < r < R$. A simple but useful observation is that $n(r)$ is a
non-decreasing function of $r$.

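We claim that if $f(0) \ne 0$, and $f$ does not vanish on the circle $C_R$,
then

$$\int_0^R n(r)\, \frac{dr}{r} = \frac{1}{2\pi} \int_0^{2\pi} \log|f(Re^{i\theta})|\, d\theta - \log|f(0)|. \tag{2}$$

This formula is immediate from Jensen’s equality and the next lemma.

Lemma 1.2 If $z_1, \ldots, z_N$ are the zeros of $f$ inside the disc $D_R$, then

$$\int_0^R n(r)\, \frac{dr}{r} = \sum_{k=1}^{N} \log\left|\frac{R}{z_k}\right|.$$

Proof. First we have

$$\sum_{k=1}^{N} \log\left|\frac{R}{z_k}\right| = \sum_{k=1}^{N} \int_{|z_k|}^{R} \frac{dr}{r}.$$

If we define the characteristic function

$$\eta_k(r) = \begin{cases} 1 & \text{if } r > |z_k|, \\ 0 & \text{if } r \le |z_k|, \end{cases}$$

then $\sum_{k=1}^{N} \eta_k(r) = n(r)$, and the lemma is proved using

$$\sum_{k=1}^{N} \int_{|z_k|}^{R} \frac{dr}{r} = \sum_{k=1}^{N} \int_0^R \eta_k(r)\, \frac{dr}{r} = \int_0^R \left( \sum_{k=1}^{N} \eta_k(r) \right) \frac{dr}{r} = \int_0^R n(r)\, \frac{dr}{r}.$$
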
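
The identity between the integral of the counting function and the sum of logarithms can be seen concretely, since $n(r)$ is a step function. The following sketch assumes a hypothetical list of zero moduli (with a repeated value standing for a multiple zero) and integrates $n(r)/r$ exactly, piece by piece.

```python
import math

def n(r, zero_moduli):
    """Number of zeros (with multiplicity) of modulus strictly less than r."""
    return sum(1 for m in zero_moduli if m < r)

def integral_of_n(R, zero_moduli):
    """Integral over [0, R] of n(r) dr / r, computed exactly on each constant piece."""
    pts = sorted(m for m in zero_moduli if m < R) + [R]
    total = 0.0
    for lo, hi in zip(pts, pts[1:]):
        if hi > lo:
            # n is constant on (lo, hi); evaluate it at the midpoint.
            total += n(0.5 * (lo + hi), zero_moduli) * math.log(hi / lo)
    return total

def sum_of_logs(R, zero_moduli):
    """Sum of log(R / |z_k|) over the zeros inside the disc of radius R."""
    return sum(math.log(R / m) for m in zero_moduli if m < R)

if __name__ == "__main__":
    moduli = [0.3, 0.5, 0.5, 0.9, 1.7]  # hypothetical moduli |z_k|
    print(integral_of_n(1.0, moduli), sum_of_logs(1.0, moduli))
```
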
2 Functions of finite order

Let $f$ be an entire function. If there exist a positive number $\rho$ and
constants $A, B > 0$ such that

$$|f(z)| \le A e^{B|z|^\rho} \quad \text{for all } z \in \mathbb{C},$$

then we say that $f$ has an order of growth $\le \rho$. We define the order
of growth of $f$ as

$$\rho_f = \inf \rho,$$

where the infimum is over all $\rho > 0$ such that $f$ has an order of growth
$\le \rho$.

For example, the order of growth of the function $e^{z^2}$ is 2.

Theorem 2.1 If $f$ is an entire function that has an order of growth $\le \rho$,
then:

(i) $n(r) \le C r^\rho$ for some $C > 0$ and all sufficiently large $r$.

(ii) If $z_1, z_2, \ldots$ denote the zeros of $f$, with $z_k \ne 0$, then for all $s > \rho$
we have

$$\sum_{k=1}^{\infty} \frac{1}{|z_k|^s} < \infty.$$

Proof. It suffices to prove the estimate for $n(r)$ when $f(0) \ne 0$. Indeed,
consider the function $F(z) = f(z)/z^\ell$ where $\ell$ is the order of the zero of
$f$ at the origin. Then $n_f(r)$ and $n_F(r)$ differ only by a constant, and $F$
also has an order of growth $\le \rho$.

If $f(0) \ne 0$ we may use formula (2), namely

$$\int_0^R n(x)\, \frac{dx}{x} = \frac{1}{2\pi} \int_0^{2\pi} \log|f(Re^{i\theta})|\, d\theta - \log|f(0)|.$$

Choosing $R = 2r$, this formula implies

$$\int_r^{2r} n(x)\, \frac{dx}{x} \le \frac{1}{2\pi} \int_0^{2\pi} \log|f(Re^{i\theta})|\, d\theta - \log|f(0)|.$$

On the one hand, since $n(r)$ is increasing, we have

$$\int_r^{2r} n(x)\, \frac{dx}{x} \ge n(r) \int_r^{2r} \frac{dx}{x} = n(r)[\log 2r - \log r] = n(r) \log 2,$$

and on the other hand, the growth condition on $f$ gives

$$\int_0^{2\pi} \log|f(Re^{i\theta})|\, d\theta \le \int_0^{2\pi} \log|A e^{BR^\rho}|\, d\theta \le C' r^\rho$$

for all large $r$. Consequently, $n(r) \le C r^\rho$ for an appropriate $C > 0$ and
all sufficiently large $r$.

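The following estimates prove the second part of the theorem:

$$\begin{aligned}
\sum_{|z_k| \ge 1} |z_k|^{-s} &= \sum_{j=0}^{\infty} \left( \sum_{2^j \le |z_k| < 2^{j+1}} |z_k|^{-s} \right) \\
&\le \sum_{j=0}^{\infty} 2^{-js}\, n(2^{j+1}) \\
&\le c \sum_{j=0}^{\infty} 2^{-js}\, 2^{(j+1)\rho} \\
&\le c' \sum_{j=0}^{\infty} (2^{\rho - s})^j \\
&< \infty.
\end{aligned}$$

The last series converges because $s > \rho$.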

Part (ii) of the theorem is a noteworthy fact, which we shall use in a 
later part of this chapter. 

We give two simple examples of the theorem; each of these shows that
the condition $s > \rho$ cannot be improved.

Example 1. Consider $f(z) = \sin \pi z$. Recall Euler’s identity, namely

$$\sin \pi z = \frac{e^{i\pi z} - e^{-i\pi z}}{2i},$$

which implies that $|f(z)| \le e^{\pi|z|}$, and $f$ has an order of growth $\le 1$. By
taking $z = ix$, where $x \in \mathbb{R}$, it is clear that the order of growth of $f$ is
actually equal to 1. However, $f$ vanishes to order 1 at $z = n$ for each
$n \in \mathbb{Z}$, and $\sum_{n \ne 0} 1/|n|^s < \infty$ precisely when $s > 1$.
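
Both parts of Theorem 2.1 can be observed on this example. The sketch below counts the zeros of $\sin \pi z$ in a disc and compares partial sums of $\sum_{n \ne 0} |n|^{-s}$ for $s = 2$ (convergent, with limit $\pi^2/3$) and $s = 1$ (divergent, growing like $2 \log N$). The function names are chosen here for illustration.

```python
import math

def n_sine(r):
    """Number of zeros of sin(pi z) in |z| < r, that is, integers k with |k| < r."""
    if r <= 0:
        return 0
    return 2 * (math.ceil(r) - 1) + 1

def partial_sum(s, N):
    """Sum of 1 / |k|^s over the nonzero integers k with |k| <= N."""
    return 2 * sum(1.0 / k ** s for k in range(1, N + 1))

if __name__ == "__main__":
    for r in (1, 2.5, 10.2, 100):
        print(r, n_sine(r), 3 * r)           # n(r) <= C r with C = 3 for r >= 1
    for N in (10, 1000, 100000):
        print(N, partial_sum(2, N), partial_sum(1, N))
```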

Example 2. Consider $f(z) = \cos z^{1/2}$, which we define by

$$\cos z^{1/2} = \sum_{n=0}^{\infty} (-1)^n \frac{z^n}{(2n)!}.$$

Then $f$ is entire, and it is easy to see that

$$|f(z)| \le e^{|z|^{1/2}},$$

and the order of growth of $f$ is 1/2. Moreover, $f(z)$ vanishes when
$z_n = ((n + 1/2)\pi)^2$, while $\sum_n 1/|z_n|^s < \infty$ exactly when $s > 1/2$.

A natural question is whether or not, given any sequence of complex
numbers $z_1, z_2, \ldots$, there exists an entire function $f$ with zeros precisely
at the points of this sequence. A necessary condition is that $z_1, z_2, \ldots$ do
not accumulate, in other words we must have

$$\lim_{k \to \infty} |z_k| = \infty,$$

otherwise $f$ would vanish identically by Theorem 4.8 in Chapter 2.
Weierstrass proved that this condition is also sufficient by explicitly
constructing a function with these prescribed zeros. A first guess is of course the
product

$$(z - z_1)(z - z_2) \cdots,$$

which provides a solution in the special case when the sequence of zeros
is finite. In general, Weierstrass showed how to insert factors in this
product so that the convergence is guaranteed, yet no new zeros are
introduced.

Before coming to the general construction, we review infinite products 
and study a basic example. 


3 Infinite products 

3.1 Generalities 


Given a sequence $\{a_n\}_{n=1}^{\infty}$ of complex numbers, we say that the product

$$\prod_{n=1}^{\infty} (1 + a_n)$$

converges if the limit

$$\lim_{N \to \infty} \prod_{n=1}^{N} (1 + a_n)$$

of the partial products exists.

A useful sufficient condition that guarantees the existence of a product
is contained in the following proposition.
Proposition 3.1 If $\sum |a_n| < \infty$, then the product $\prod_{n=1}^{\infty} (1 + a_n)$ converges.
Moreover, the product converges to 0 if and only if one of its
factors is 0.

This is simply Proposition 1.9 of Chapter 8 in Book I. We repeat the
proof here.

Proof. If $\sum |a_n|$ converges, then for all large $n$ we must have
$|a_n| < 1/2$. Disregarding if necessary finitely many terms, we may assume
that this inequality holds for all $n$. In particular, we can define
$\log(1 + a_n)$ by the usual power series (see (6) in Chapter 3), and this
logarithm satisfies the property that $1 + z = e^{\log(1+z)}$ whenever $|z| < 1$.
Hence we may write the partial products as follows:

$$\prod_{n=1}^{N} (1 + a_n) = \prod_{n=1}^{N} e^{\log(1 + a_n)} = e^{B_N},$$

where $B_N = \sum_{n=1}^{N} b_n$ with $b_n = \log(1 + a_n)$. By the power series expansion
we see that $|\log(1 + z)| \le 2|z|$, if $|z| < 1/2$. Hence $|b_n| \le 2|a_n|$, so
$B_N$ converges as $N \to \infty$ to a complex number, say $B$. Since the exponential
function is continuous, we conclude that $e^{B_N}$ converges to $e^B$ as
$N \to \infty$, proving the first assertion of the proposition. Observe also that
if $1 + a_n \ne 0$ for all $n$, then the product converges to a non-zero limit
since it is expressed as $e^B$.
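
The mechanism of the proof can be illustrated with the sequence $a_n = 1/n^2$, an example chosen here and not taken from the text. Its product has the known value $\sinh(\pi)/\pi$, which follows from the product formula for the sine evaluated at $z = i$. The sketch compares the partial products with the exponential of the partial sums of logarithms, and spot-checks the estimate $|\log(1+z)| \le 2|z|$.

```python
import cmath
import math

def partial_product(N):
    """Product of (1 + 1/n^2) for n = 1, ..., N."""
    p = 1.0
    for n in range(1, N + 1):
        p *= 1 + 1.0 / (n * n)
    return p

def exp_of_log_sum(N):
    """exp(B_N), where B_N is the sum of log(1 + 1/n^2) for n = 1, ..., N."""
    return math.exp(sum(math.log1p(1.0 / (n * n)) for n in range(1, N + 1)))

def log_estimate_holds(z):
    """Check |log(1 + z)| <= 2|z| (principal branch), as used for |z| <= 1/2."""
    return abs(cmath.log(1 + z)) <= 2 * abs(z)

if __name__ == "__main__":
    limit = math.sinh(math.pi) / math.pi
    for N in (10, 1000, 100000):
        print(N, partial_product(N), exp_of_log_sum(N), limit)
```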

More generally, we can consider products of holomorphic functions. 

Proposition 3.2 Suppose $\{F_n\}$ is a sequence of holomorphic functions
on the open set $\Omega$. If there exist constants $c_n > 0$ such that

$$\sum c_n < \infty \quad \text{and} \quad |F_n(z) - 1| \le c_n \quad \text{for all } z \in \Omega,$$

then:

(i) The product $\prod_{n=1}^{\infty} F_n(z)$ converges uniformly in $\Omega$ to a holomorphic
function $F(z)$.

(ii) If $F_n(z)$ does not vanish for any $n$, then

$$\frac{F'(z)}{F(z)} = \sum_{n=1}^{\infty} \frac{F_n'(z)}{F_n(z)}.$$

Proof. To prove the first statement, note that for each $z$ we may
argue as in the previous proposition if we write $F_n(z) = 1 + a_n(z)$, with
$|a_n(z)| \le c_n$. Then, we observe that the estimates are actually uniform
in $z$ because the $c_n$’s are constants. It follows that the product converges
uniformly to a holomorphic function, which we denote by $F(z)$.

To establish the second part of the proposition, suppose that $K$ is a
compact subset of $\Omega$, and let

$$G_N(z) = \prod_{n=1}^{N} F_n(z).$$

We have just proved that $G_N \to F$ uniformly in $\Omega$, so by Theorem 5.3
in Chapter 2, the sequence $\{G_N'\}$ converges uniformly to $F'$ in $K$. Since
$G_N$ is uniformly bounded from below on $K$, we conclude that $G_N'/G_N \to F'/F$
uniformly on $K$, and because $K$ is an arbitrary compact subset of
$\Omega$, the limit holds for every point of $\Omega$. Moreover, as we saw in Section 4
of Chapter 3

$$\frac{G_N'}{G_N} = \sum_{n=1}^{N} \frac{F_n'}{F_n},$$

so part (ii) of the proposition is also proved.


3.2 Example: the product formula for the sine function 

Before proceeding with the general theory of Weierstrass products, we 
consider the key example of the product formula for the sine function: 


$$\frac{\sin \pi z}{\pi} = z \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2}\right). \tag{3}$$

This identity will in turn be derived from the sum formula for the cotangent
function ($\cot \pi z = \cos \pi z / \sin \pi z$):

$$\pi \cot \pi z = \sum_{n=-\infty}^{\infty} \frac{1}{z+n} = \lim_{N \to \infty} \sum_{|n| \le N} \frac{1}{z+n} = \frac{1}{z} + \sum_{n=1}^{\infty} \frac{2z}{z^2 - n^2}. \tag{4}$$

The first formula holds for all complex numbers $z$, and the second whenever
$z$ is not an integer. The sum $\sum_{n=-\infty}^{\infty} 1/(z+n)$ needs to be properly
understood, because the separate halves corresponding to positive and
negative values of $n$ do not converge. Only when interpreted symmetrically,
as $\lim_{N \to \infty} \sum_{|n| \le N} 1/(z+n)$, does the cancellation of terms lead
to a convergent series as in (4) above.

We prove (4) by showing that both $\pi \cot \pi z$ and the series have the
same structural properties. In fact, observe that if $F(z) = \pi \cot \pi z$, then
$F$ has the following three properties:

(i) $F(z + 1) = F(z)$ whenever $z$ is not an integer.

(ii) $F(z) = \frac{1}{z} + F_0(z)$, where $F_0$ is analytic near 0.

(iii) $F(z)$ has simple poles at the integers, and no other singularities.

Then, we note that the function

$$\sum_{n=-\infty}^{\infty} \frac{1}{z+n} = \lim_{N \to \infty} \sum_{|n| \le N} \frac{1}{z+n}$$

also satisfies these same three properties. In fact, property (i) is simply
the observation that the passage from $z$ to $z + 1$ merely shifts the terms
in the infinite sum. To be precise,

$$\sum_{|n| \le N} \frac{1}{z+1+n} = \frac{1}{z+1+N} - \frac{1}{z-N} + \sum_{|n| \le N} \frac{1}{z+n}.$$

Letting $N$ tend to infinity proves the assertion. Properties (ii) and (iii)
are evident from the representation $\frac{1}{z} + \sum_{n=1}^{\infty} \frac{2z}{z^2 - n^2}$ of the sum.
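
Formulas (3) and (4) lend themselves to a numerical sanity check. The sketch below uses a test point and truncation level chosen only for illustration, and hypothetical function names. It compares the truncated product with $\sin \pi z / \pi$ and the symmetric partial sum with $\pi \cot \pi z$. Both truncations converge slowly, with error of order $1/N$, so only a few digits of agreement should be expected.

```python
import cmath
import math

def sine_product(z, N):
    """Partial product z * prod_{n=1}^{N} (1 - z^2/n^2), approximating sin(pi z)/pi."""
    p = z
    for n in range(1, N + 1):
        p *= 1 - z * z / (n * n)
    return p

def cot_symmetric_sum(z, N):
    """Partial sum 1/z + sum_{n=1}^{N} 2z/(z^2 - n^2), approximating pi*cot(pi z)."""
    return 1 / z + sum(2 * z / (z * z - n * n) for n in range(1, N + 1))

z = 0.3 + 0.2j   # illustrative non-integer test point
N = 100_000      # illustrative truncation level

print(sine_product(z, N), cmath.sin(math.pi * z) / math.pi)
print(cot_symmetric_sum(z, N), math.pi / cmath.tan(math.pi * z))
```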

Therefore, the function defined by

$$\Delta(z) = F(z) - \sum_{n=-\infty}^{\infty} \frac{1}{z+n}$$

is periodic in the sense that $\Delta(z + 1) = \Delta(z)$, and by (ii) the singularity
of $\Delta$ at the origin is removable, and hence by periodicity the singularities
at all the integers are also removable; this implies that $\Delta$ is entire.

To prove our formula, it will suffice to show that the function $\Delta$ is
bounded in the complex plane. By the periodicity above, it is enough
to do so in the strip $|\mathrm{Re}(z)| \le 1/2$. This is because every $z' \in \mathbb{C}$ is of
the form $z' = z + k$, where $z$ is in the strip and $k$ is an integer. Since $\Delta$
is holomorphic, it is bounded in the rectangle $|\mathrm{Im}(z)| \le 1$, and we need
only control the behavior of that function for $|\mathrm{Im}(z)| > 1$. If $\mathrm{Im}(z) > 1$
and $z = x + iy$, then

$$\cot \pi z = i\,\frac{e^{i\pi z} + e^{-i\pi z}}{e^{i\pi z} - e^{-i\pi z}} = i\,\frac{e^{-2\pi y} + e^{-2\pi i x}}{e^{-2\pi y} - e^{-2\pi i x}},$$

and in absolute value this quantity is bounded. Also

$$\frac{1}{z} + \sum_{n=1}^{\infty} \frac{2z}{z^2 - n^2} = \frac{1}{x + iy} + \sum_{n=1}^{\infty} \frac{2(x + iy)}{x^2 - y^2 - n^2 + 2ixy};$$

therefore if $y > 1$, we have

$$\left| \frac{1}{z} + \sum_{n=1}^{\infty} \frac{2z}{z^2 - n^2} \right| \le C + C \sum_{n=1}^{\infty} \frac{y}{y^2 + n^2}.$$

Now the sum on the right-hand side is majorized by

$$\int_0^{\infty} \frac{y}{y^2 + x^2}\,dx,$$

because the function $y/(y^2 + x^2)$ is decreasing in $x$; moreover, as the
change of variables $x \mapsto yx$ shows, the integral is independent of $y$, and
hence bounded. By a similar argument $\Delta$ is bounded in the strip where
$\mathrm{Im}(z) < -1$, hence is bounded throughout the whole strip $|\mathrm{Re}(z)| \le 1/2$.
Therefore $\Delta$ is bounded in $\mathbb{C}$, and by Liouville’s theorem, $\Delta(z)$ is
constant. The observation that $\Delta$ is odd shows that this constant must be
0, and concludes the proof of formula (4).
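
The uniform bound on $\sum_{n \ge 1} y/(y^2+n^2)$ used in this proof can be observed directly. Here is a minimal sketch (the function name and sample values of $y$ are assumptions): the integral $\int_0^\infty y/(y^2+x^2)\,dx$ equals $\pi/2$ for every $y > 0$, and the partial sums stay below that value.

```python
import math

def majorized_sum(y, N=100_000):
    """Partial sum of y/(y^2 + n^2) over n = 1..N."""
    return sum(y / (y * y + n * n) for n in range(1, N + 1))

for y in (1.0, 5.0, 50.0):
    # The comparison integral int_0^inf y/(y^2+x^2) dx equals pi/2 for each y > 0.
    print(y, majorized_sum(y), math.pi / 2)
```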

To prove (3), we now let

$$G(z) = \frac{\sin \pi z}{\pi} \quad\text{and}\quad P(z) = z \prod_{n=1}^{\infty} \left(1 - \frac{z^2}{n^2}\right).$$

Proposition 3.2 and the fact that $\sum 1/n^2 < \infty$ guarantee that the
product $P(z)$ converges, and that away from the integers we have

$$\frac{P'(z)}{P(z)} = \frac{1}{z} + \sum_{n=1}^{\infty} \frac{2z}{z^2 - n^2}.$$

Since $G'(z)/G(z) = \pi \cot \pi z$, the cotangent formula (4) gives

$$\left( \frac{P(z)}{G(z)} \right)' = \frac{P(z)}{G(z)} \left[ \frac{P'(z)}{P(z)} - \frac{G'(z)}{G(z)} \right] = 0,$$

and so $P(z) = cG(z)$ for some constant $c$. Dividing this identity by $z$,
and taking the limit as $z \to 0$, we find $c = 1$.

Remark. Other proofs of (4) and (3) can be given by integrating
analogous identities for $\pi^2/(\sin \pi z)^2$ derived in Exercise 12, Chapter 3,
and Exercise 7, Chapter 4. Still other proofs using Fourier series can be
found in the exercises of Chapters 3 and 5 of Book I.


4 Weierstrass infinite products

We now turn to Weierstrass’s construction of an entire function with
prescribed zeros.

Theorem 4.1 Given any sequence $\{a_n\}$ of complex numbers with
$|a_n| \to \infty$ as $n \to \infty$, there exists an entire function $f$ that vanishes at
all $z = a_n$ and nowhere else. Any other such entire function is of the
form $f(z)e^{g(z)}$, where $g$ is entire.

Recall that if a holomorphic function $f$ vanishes at $z = a$, then the
multiplicity of the zero $a$ is the integer $m$ so that

$$f(z) = (z - a)^m g(z),$$

where $g$ is holomorphic and nowhere vanishing in a neighborhood of $a$.
Alternatively, $m$ is the first non-zero power of $z - a$ in the power series
expansion of $f$ at $a$. Since, as before, we allow for repetitions in the
sequence $\{a_n\}$, the theorem actually guarantees the existence of entire
functions with prescribed zeros and with desired multiplicities.

To begin the proof, note first that if $f_1$ and $f_2$ are two entire functions
that vanish at all $z = a_n$ and nowhere else, then $f_1/f_2$ has removable
singularities at all the points $a_n$. Hence $f_1/f_2$ is entire and vanishes
nowhere, so that there exists an entire function $g$ with $f_1(z)/f_2(z) = e^{g(z)}$,
as we showed in Section 6 of Chapter 3. Therefore $f_1(z) = f_2(z)e^{g(z)}$
and the last statement of the theorem is verified.

Hence we are left with the task of constructing a function that vanishes
at all the points of the sequence $\{a_n\}$ and nowhere else. A naive guess,
suggested by the product formula for $\sin \pi z$, is the product $\prod_n (1 - z/a_n)$.
The problem is that this product converges only for suitable sequences
$\{a_n\}$, so we correct this by inserting exponential factors. These factors
will make the product converge without adding new zeros.

For each integer $k \ge 0$ we define canonical factors by

$$E_0(z) = 1 - z \quad\text{and}\quad E_k(z) = (1 - z)e^{z + z^2/2 + \cdots + z^k/k}, \quad\text{for } k \ge 1.$$

The integer $k$ is called the degree of the canonical factor.

Lemma 4.2 If $|z| \le 1/2$, then $|1 - E_k(z)| \le c|z|^{k+1}$ for some $c > 0$.

Proof. If $|z| \le 1/2$, then with the logarithm defined in terms of the
power series, we have $1 - z = e^{\log(1-z)}$, and therefore

$$E_k(z) = e^{\log(1-z) + z + z^2/2 + \cdots + z^k/k} = e^{w},$$

where $w = -\sum_{n=k+1}^{\infty} z^n/n$. Observe that since $|z| \le 1/2$ we have

$$|w| \le |z|^{k+1} \sum_{n=k+1}^{\infty} |z|^{n-k-1}/n \le |z|^{k+1} \sum_{j=0}^{\infty} 2^{-j} \le 2|z|^{k+1}.$$

In particular, we have $|w| \le 1$ and this implies that

$$|1 - E_k(z)| = |1 - e^{w}| \le c'|w| \le c|z|^{k+1}.$$

Remark. An important technical point is that the constant $c$ in the
statement of the lemma can be chosen to be independent of $k$. In fact,
an examination of the proof shows that we may take $c' = e$ and then
$c = 2e$.
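
As a rough numerical check of Lemma 4.2 and the remark, the sketch below samples points with $0 < |z| \le 1/2$ and records the largest observed ratio $|1 - E_k(z)| / |z|^{k+1}$ for several degrees $k$. The sampling radii, the number of sample points, and the function names are assumptions made for the illustration. The observed ratios should stay below $2e$.

```python
import cmath
import math

def canonical_factor(k, z):
    """E_k(z) = (1 - z) * exp(z + z^2/2 + ... + z^k/k), with E_0(z) = 1 - z."""
    exponent = sum(z**j / j for j in range(1, k + 1))
    return (1 - z) * cmath.exp(exponent)

def worst_ratio(k, samples=200):
    """Largest observed |1 - E_k(z)| / |z|^(k+1) on circles of radius 0.1, 0.25, 0.5."""
    worst = 0.0
    for r in (0.1, 0.25, 0.5):
        for m in range(samples):
            z = cmath.rect(r, 2 * math.pi * m / samples)
            ratio = abs(1 - canonical_factor(k, z)) / abs(z) ** (k + 1)
            worst = max(worst, ratio)
    return worst

for k in range(6):
    print(k, worst_ratio(k), "bound 2e =", 2 * math.e)
```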

Suppose that we are given a zero of order $m$ at the origin, and that
$a_1, a_2, \ldots$ are all non-zero. Then we define the Weierstrass product by

$$f(z) = z^m \prod_{n=1}^{\infty} E_n(z/a_n).$$

We claim that this function has the required properties; that is, $f$ is
entire with a zero of order $m$ at the origin, zeros at each point of the
sequence $\{a_n\}$, and $f$ vanishes nowhere else.

Fix $R > 0$, and suppose that $z$ belongs to the disc $|z| < R$. We shall
prove that $f$ has all the desired properties in this disc, and since $R$ is
arbitrary, this will prove the theorem.

We can consider two types of factors in the formula defining $f$, with
the choice depending on whether $|a_n| \le 2R$ or $|a_n| > 2R$. There are only
finitely many terms of the first kind (since $|a_n| \to \infty$), and we see that
the finite product vanishes at all $z = a_n$ with $|a_n| < R$. If $|a_n| \ge 2R$, we
have $|z/a_n| \le 1/2$, hence the previous lemma implies

$$|1 - E_n(z/a_n)| \le c \left| \frac{z}{a_n} \right|^{n+1} \le \frac{c}{2^{n+1}}.$$

Note that by the above remark, $c$ does not depend on $n$. Therefore, the
product

$$\prod_{|a_n| \ge 2R} E_n(z/a_n)$$

defines a holomorphic function when $|z| < R$, and does not vanish in
that disc by the propositions in Section 3. This shows that the function
$f$ has the desired properties, and the proof of Weierstrass’s theorem is
complete.

5 Hadamard’s factorization theorem 

The theorem of this section combines the results relating the growth of
a function to the number of zeros it possesses, and the above product
theorem. Weierstrass’s theorem states that a function that vanishes at
the points $a_1, a_2, \ldots$ takes the form

$$e^{g(z)} z^m \prod_{n=1}^{\infty} E_n(z/a_n).$$

Hadamard refined this result by showing that in the case of functions
of finite order, the degree of the canonical factors can be taken to be
constant, and $g$ is then a polynomial.

Recall that an entire function has an order of growth $\le \rho$ if

$$|f(z)| \le Ae^{B|z|^{\rho}},$$

and that the order of growth $\rho_0$ of $f$ is the infimum of all such $\rho$’s.

A basic result we proved earlier was that if $f$ has order of growth $\le \rho$,
then

$$n(r) \le Cr^{\rho}, \quad\text{for all large } r,$$

and if $a_1, a_2, \ldots$ are the non-zero zeros of $f$, and $s > \rho$, then

$$\sum |a_n|^{-s} < \infty.$$

Theorem 5.1 Suppose $f$ is entire and has growth order $\rho_0$. Let $k$ be the
integer so that $k \le \rho_0 < k + 1$. If $a_1, a_2, \ldots$ denote the (non-zero) zeros
of $f$, then

$$f(z) = e^{P(z)} z^m \prod_{n=1}^{\infty} E_k(z/a_n),$$

where $P$ is a polynomial of degree $\le k$, and $m$ is the order of the zero of
$f$ at $z = 0$.
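
A familiar instance of this factorization is $f(z) = \sin \pi z$, which has order 1 (so $k = 1$), a simple zero at the origin, and non-zero zeros at $\pm 1, \pm 2, \ldots$. Pairing the zeros $n$ and $-n$ makes the exponential factors cancel, $E_1(z/n)E_1(-z/n) = 1 - z^2/n^2$, and by formula (3) the polynomial $P$ is the constant $\log \pi$. The sketch below checks a symmetric truncation of the product numerically; the test point, truncation level, and names are assumptions made for the illustration.

```python
import cmath
import math

def E1(w):
    """Canonical factor of degree 1: E_1(w) = (1 - w) e^{w}."""
    return (1 - w) * cmath.exp(w)

def hadamard_sine(z, N):
    """pi * z * product over 0 < |n| <= N of E_1(z/n), a truncation for sin(pi z)."""
    p = math.pi * z
    for n in range(1, N + 1):
        p *= E1(z / n) * E1(-z / n)
    return p

z = 0.4 - 0.7j  # illustrative test point
print(hadamard_sine(z, 50_000), cmath.sin(math.pi * z))
```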


Main lemmas 

Here we gather a few lemmas needed in the proof of Hadamard’s theorem.

Lemma 5.2 The canonical products satisfy

$$|E_k(z)| \ge e^{-c|z|^{k+1}} \quad\text{if } |z| \le 1/2$$

and

$$|E_k(z)| \ge |1 - z|\,e^{-c'|z|^{k}} \quad\text{if } |z| \ge 1/2.$$

Proof. If $|z| \le 1/2$ we can use the power series to define the logarithm
of $1 - z$, so that

$$E_k(z) = e^{\log(1-z) + \sum_{n=1}^{k} z^n/n} = e^{-\sum_{n=k+1}^{\infty} z^n/n} = e^{w}.$$

Since $|e^{w}| \ge e^{-|w|}$ and $|w| \le c|z|^{k+1}$, the first part of the lemma follows.
For the second part, simply observe that if $|z| \ge 1/2$, then

$$|E_k(z)| = |1 - z|\,\left| e^{z + z^2/2 + \cdots + z^k/k} \right|,$$

and that there exists $c' > 0$ such that

$$\left| e^{z + z^2/2 + \cdots + z^k/k} \right| \ge e^{-|z + z^2/2 + \cdots + z^k/k|} \ge e^{-c'|z|^{k}}.$$

The inequality in the lemma when $|z| \ge 1/2$ then follows from these
observations.

The key to the proof of Hadamard’s theorem consists of finding a lower
bound for the product of the canonical factors when $z$ stays away from
the zeros $\{a_n\}$. Therefore, we shall first estimate the product from below,
in the complement of small discs centered at these points.


Lemma 5.3 For any $s$ with $\rho_0 < s < k + 1$, we have

$$\left| \prod_{n=1}^{\infty} E_k(z/a_n) \right| \ge e^{-c|z|^{s}},$$

except possibly when $z$ belongs to the union of the discs centered at $a_n$ of
radius $|a_n|^{-k-1}$, for $n = 1, 2, 3, \ldots$.

Proof. The proof of this lemma is a little subtle. First, we write

$$\prod_{n=1}^{\infty} E_k(z/a_n) = \prod_{|a_n| \le 2|z|} E_k(z/a_n) \prod_{|a_n| > 2|z|} E_k(z/a_n).$$

For the second product the estimate asserted above holds with no
restriction on $z$. Indeed, by the previous lemma

$$\left| \prod_{|a_n| > 2|z|} E_k(z/a_n) \right| = \prod_{|a_n| > 2|z|} |E_k(z/a_n)| \ge \prod_{|a_n| > 2|z|} e^{-c|z/a_n|^{k+1}} \ge e^{-c|z|^{k+1} \sum_{|a_n| > 2|z|} |a_n|^{-k-1}}.$$

But $|a_n| > 2|z|$ and $s < k + 1$, so we must have

$$|a_n|^{-k-1} = |a_n|^{-s}|a_n|^{s-k-1} \le C|a_n|^{-s}|z|^{s-k-1}.$$

Therefore, the fact that $\sum |a_n|^{-s}$ converges implies that

$$\left| \prod_{|a_n| > 2|z|} E_k(z/a_n) \right| \ge e^{-c|z|^{s}}$$

for some $c > 0$.

To estimate the first product, we use the second part of Lemma 5.2,
and write

$$\left| \prod_{|a_n| \le 2|z|} E_k(z/a_n) \right| \ge \prod_{|a_n| \le 2|z|} \left| 1 - \frac{z}{a_n} \right| \prod_{|a_n| \le 2|z|} e^{-c'|z/a_n|^{k}}. \tag{5}$$

We now note that

$$\prod_{|a_n| \le 2|z|} e^{-c'|z/a_n|^{k}} = e^{-c'|z|^{k} \sum_{|a_n| \le 2|z|} |a_n|^{-k}},$$

and again, we have $|a_n|^{-k} = |a_n|^{-s}|a_n|^{s-k} \le C|a_n|^{-s}|z|^{s-k}$, thereby
proving that

$$\prod_{|a_n| \le 2|z|} e^{-c'|z/a_n|^{k}} \ge e^{-c|z|^{s}}.$$

It is the estimate on the first product on the right-hand side of (5)
which requires the restriction on $z$ imposed in the statement of the
lemma. Indeed, whenever $z$ does not belong to a disc of radius $|a_n|^{-k-1}$
centered at $a_n$, we must have $|a_n - z| \ge |a_n|^{-k-1}$. Therefore

$$\prod_{|a_n| \le 2|z|} \left| 1 - \frac{z}{a_n} \right| = \prod_{|a_n| \le 2|z|} \left| \frac{a_n - z}{a_n} \right| \ge \prod_{|a_n| \le 2|z|} |a_n|^{-k-1}|a_n|^{-1} = \prod_{|a_n| \le 2|z|} |a_n|^{-k-2}.$$

Finally, the estimate for the first product follows from the fact that

$$(k + 2) \sum_{|a_n| \le 2|z|} \log |a_n| \le (k + 2)\,n(2|z|) \log 2|z| \le c|z|^{s} \log 2|z| \le c'|z|^{s'},$$

for any $s' > s$, and the second inequality follows because $n(2|z|) \le c|z|^{s}$
by Theorem 2.1. Since we restricted $s$ to satisfy $s > \rho_0$, we can take
an initial $s$ sufficiently close to $\rho_0$, so that the assertion of the lemma is
established (with $s$ being replaced by $s'$).


Corollary 5.4 There exists a sequence of radii, $r_1, r_2, \ldots$, with
$r_m \to \infty$, such that

$$\left| \prod_{n=1}^{\infty} E_k(z/a_n) \right| \ge e^{-c|z|^{s}} \quad\text{for } |z| = r_m.$$

Proof. Since $\sum |a_n|^{-k-1} < \infty$, there exists an integer $N$ so that

$$\sum_{n=N}^{\infty} |a_n|^{-k-1} < 1/10.$$

Therefore, given any two consecutive large integers $L$ and $L + 1$, we can
find a positive number $r$ with $L \le r \le L + 1$, such that the circle of
radius $r$ centered at the origin does not intersect the forbidden discs of
Lemma 5.3. For otherwise, the union of the intervals

$$I_n = \left[ |a_n| - \frac{1}{|a_n|^{k+1}},\ |a_n| + \frac{1}{|a_n|^{k+1}} \right]$$

(which are of length $2|a_n|^{-k-1}$) would cover all the interval $[L, L + 1]$.
(See Figure 1.) This would imply $2\sum_{n=N}^{\infty} |a_n|^{-k-1} \ge 1$, which is a
contradiction. We can then apply the previous lemma with $|z| = r$ to
conclude the proof of the corollary.


Figure 1. The intervals $I_n$


Proof of Hadamard’s theorem 

Let

$$E(z) = z^m \prod_{n=1}^{\infty} E_k(z/a_n).$$

To prove that $E$ is entire, we repeat the argument in the proof of
Theorem 4.1; we take into account that by Lemma 4.2

$$|1 - E_k(z/a_n)| \le c \left| \frac{z}{a_n} \right|^{k+1} \quad\text{for all large } n,$$

and that the series $\sum |a_n|^{-k-1}$ converges. (Recall $\rho_0 < s < k + 1$.)
Moreover, $E$ has the zeros of $f$, therefore $f/E$ is holomorphic and nowhere
vanishing. Hence

$$\frac{f(z)}{E(z)} = e^{g(z)}$$

for some entire function $g$. By the fact that $f$ has growth order $\rho_0$, and
because of the estimate from below for $E$ obtained in Corollary 5.4, we
have

$$e^{\mathrm{Re}(g(z))} = \left| \frac{f(z)}{E(z)} \right| \le c'e^{c|z|^{s}}$$

for $|z| = r_m$. This proves that

$$\mathrm{Re}(g(z)) \le C|z|^{s}, \quad\text{for } |z| = r_m.$$

The proof of Hadamard’s theorem is therefore complete if we can
establish the following final lemma.

Lemma 5.5 Suppose $g$ is entire and $u = \mathrm{Re}(g)$ satisfies

$$u(z) \le Cr^{s} \quad\text{whenever } |z| = r$$

for a sequence of positive real numbers $r$ that tends to infinity. Then $g$
is a polynomial of degree $\le s$.

Proof. We can expand $g$ in a power series centered at the origin

$$g(z) = \sum_{n=0}^{\infty} a_n z^n.$$

We have already proved in the last section of Chapter 3 (as a simple
application of Cauchy’s integral formulas) that

$$\frac{1}{2\pi} \int_0^{2\pi} g(re^{i\theta})e^{-in\theta}\,d\theta = \begin{cases} a_n r^n & \text{if } n \ge 0, \\ 0 & \text{if } n < 0. \end{cases} \tag{6}$$

By taking complex conjugates we find that

$$\frac{1}{2\pi} \int_0^{2\pi} \overline{g(re^{i\theta})}\,e^{-in\theta}\,d\theta = 0 \tag{7}$$

whenever $n > 0$, and since $2u = g + \overline{g}$ we add equations (6) and (7) to
obtain

$$a_n r^n = \frac{1}{\pi} \int_0^{2\pi} u(re^{i\theta})e^{-in\theta}\,d\theta, \quad\text{whenever } n > 0.$$

For $n = 0$ we can simply take real parts of both sides of (6) to find that

$$2\mathrm{Re}(a_0) = \frac{1}{\pi} \int_0^{2\pi} u(re^{i\theta})\,d\theta.$$

Now we recall the simple fact that whenever $n \ne 0$, the integral of $e^{-in\theta}$
over any circle centered at the origin vanishes. Therefore

$$a_n = \frac{1}{\pi r^n} \int_0^{2\pi} \left[ u(re^{i\theta}) - Cr^{s} \right] e^{-in\theta}\,d\theta \quad\text{when } n > 0,$$

hence

$$|a_n| \le \frac{1}{\pi r^n} \int_0^{2\pi} \left[ Cr^{s} - u(re^{i\theta}) \right] d\theta \le 2Cr^{s-n} - 2\mathrm{Re}(a_0)r^{-n}.$$

Letting $r$ tend to infinity along the sequence given in the hypothesis of
the lemma proves that $a_n = 0$ for $n > s$. This completes the proof of the
lemma and of Hadamard’s theorem.
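
The identity $a_n r^n = \frac{1}{\pi}\int_0^{2\pi} u(re^{i\theta})e^{-in\theta}\,d\theta$ for $n > 0$, which drives the proof, can be tested on a polynomial. The following is a sketch under stated assumptions: the cubic $g$, the radius, and the function names are made up for the demonstration, and the integral is approximated by an equally spaced sum. It recovers each coefficient from the real part alone and returns approximately 0 for $n$ above the degree.

```python
import cmath
import math

def coeff_from_real_part(g, n, r, M=2048):
    """Approximate (1/(pi r^n)) * int_0^{2pi} Re g(r e^{it}) e^{-int} dt for n > 0,
    using an equally spaced sum with M nodes."""
    total = 0j
    for j in range(M):
        t = 2 * math.pi * j / M
        u = g(r * cmath.exp(1j * t)).real
        total += u * cmath.exp(-1j * n * t)
    integral = total * (2 * math.pi / M)
    return integral / (math.pi * r**n)

# Hypothetical entire function for the demo: a cubic polynomial.
coeffs = [1 + 2j, -0.5j, 3.0, 0.25 - 1j]
g = lambda z: sum(c * z**k for k, c in enumerate(coeffs))

for n in (1, 2, 3, 4):
    print(n, coeff_from_real_part(g, n, r=1.7))
```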


6 Exercises 

1. Give another proof of Jensen’s formula in the unit disc using the functions
(called Blaschke factors)

$$\psi_{\alpha}(z) = \frac{\alpha - z}{1 - \overline{\alpha} z}.$$

[Hint: The function $f/(\psi_{z_1} \cdots \psi_{z_N})$ is nowhere vanishing.]


2. Find the order of growth of the following entire functions:

(a) $p(z)$ where $p$ is a polynomial.

(b) $e^{bz^n}$ for $b \ne 0$.

(c) $e^{e^z}$.


3. Show that if $\tau$ is fixed with $\mathrm{Im}(\tau) > 0$, then the Jacobi theta function

$$\Theta(z|\tau) = \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n z}$$

is of order 2 as a function of $z$. Further properties of $\Theta$ will be studied in
Chapter 10.

[Hint: $-n^2 t + 2n|z| \le -n^2 t/2$ when $t > 0$ and $n \ge 4|z|/t$.]

4. Let $t > 0$ be given and fixed, and define $F(z)$ by

$$F(z) = \prod_{n=1}^{\infty} \left(1 - e^{-2\pi n t} e^{2\pi i z}\right).$$

Note that the product defines an entire function of $z$.

(a) Show that $|F(z)| \le Ae^{a|z|^2}$, hence $F$ is of order 2.

(b) $F$ vanishes exactly when $z = -int + m$ for $n \ge 1$ and $n, m$ integers. Thus,
if $z_n$ is an enumeration of these zeros we have

$$\sum \frac{1}{|z_n|^2} = \infty \quad\text{but}\quad \sum \frac{1}{|z_n|^{2+\epsilon}} < \infty.$$

[Hint: To prove (a), write $F(z) = F_1(z)F_2(z)$ where

$$F_1(z) = \prod_{n=1}^{N} \left(1 - e^{-2\pi n t} e^{2\pi i z}\right) \quad\text{and}\quad F_2(z) = \prod_{n=N+1}^{\infty} \left(1 - e^{-2\pi n t} e^{2\pi i z}\right).$$

Choose $N \approx c|z|$ with $c$ appropriately large. Then, since

$$\left( \sum_{N+1}^{\infty} e^{-2\pi n t} \right) e^{2\pi |z|} \le 1,$$

one has $|F_2(z)| \le A$. However,

$$|1 - e^{-2\pi n t} e^{2\pi i z}| \le 1 + e^{2\pi |z|} \le 2e^{2\pi |z|}.$$

Thus $|F_1(z)| \le 2^N e^{2\pi N|z|} \le e^{c'|z|^2}$. Note that a simple variant of the function $F$
arises as a factor in the triple product formula for the Jacobi theta function $\Theta$,
taken up in Chapter 10.]

5. Show that if $\alpha > 1$, then

$$F_{\alpha}(z) = \int_{-\infty}^{\infty} e^{-|t|^{\alpha}} e^{2\pi i z t}\,dt$$

is an entire function of growth order $\alpha/(\alpha - 1)$.

[Hint: Show that

$$-\frac{|t|^{\alpha}}{2} + 2\pi |z||t| \le c|z|^{\alpha/(\alpha-1)}$$

by considering the two cases $|t|^{\alpha-1} \le A|z|$ and $|t|^{\alpha-1} \ge A|z|$, for an appropriate
constant $A$.]

6. Prove Wallis’s product formula

$$\frac{\pi}{2} = \frac{2 \cdot 2}{1 \cdot 3} \cdot \frac{4 \cdot 4}{3 \cdot 5} \cdots \frac{2m \cdot 2m}{(2m - 1) \cdot (2m + 1)} \cdots.$$

[Hint: Use the product formula for $\sin z$ at $z = \pi/2$.]


7. Establish the following properties of infinite products.

(a) Show that if $\sum |a_n|^2$ converges, then the product $\prod (1 + a_n)$ converges to a
non-zero limit if and only if $\sum a_n$ converges.

(b) Find an example of a sequence of complex numbers $\{a_n\}$ such that $\sum a_n$
converges but $\prod (1 + a_n)$ diverges.

(c) Also find an example such that $\prod (1 + a_n)$ converges and $\sum a_n$ diverges.


8. Prove that for every $z$ the product below converges, and

$$\cos(z/2)\cos(z/4)\cos(z/8) \cdots = \prod_{k=1}^{\infty} \cos(z/2^k) = \frac{\sin z}{z}.$$

[Hint: Use the fact that $\sin 2z = 2 \sin z \cos z$.]

9. Prove that if $|z| < 1$, then

$$(1 + z)(1 + z^2)(1 + z^4)(1 + z^8) \cdots = \prod_{k=0}^{\infty} \left(1 + z^{2^k}\right) = \frac{1}{1 - z}.$$

10. Find the Hadamard products for:

(a) $e^z - 1$;

(b) $\cos \pi z$.

[Hint: The answers are $e^{z/2} z \prod_{n=1}^{\infty} (1 + z^2/4n^2\pi^2)$ and $\prod_{n=0}^{\infty} (1 - 4z^2/(2n + 1)^2)$,
respectively.]

11. Show that if $f$ is an entire function of finite order that omits two values, then
$f$ is constant. This result remains true for any entire function and is known as
Picard’s little theorem.

[Hint: If $f$ misses $a$, then $f(z) - a$ is of the form $e^{p(z)}$ where $p$ is a polynomial.]


12. Suppose $f$ is entire and never vanishes, and that none of the higher derivatives
of $f$ ever vanish. Prove that if $f$ is also of finite order, then $f(z) = e^{az+b}$ for some
constants $a$ and $b$.


13. Show that the equation $e^z - z = 0$ has infinitely many solutions in $\mathbb{C}$.

[Hint: Apply Hadamard’s theorem.]

14. Deduce from Hadamard’s theorem that if $F$ is entire and of growth order $\rho$
that is non-integral, then $F$ has infinitely many zeros.


15. Prove that every meromorphic function in $\mathbb{C}$ is the quotient of two entire
functions. Also, if $\{a_n\}$ and $\{b_n\}$ are two disjoint sequences having no finite limit
points, then there exists a meromorphic function in the whole complex plane that
vanishes exactly at $\{a_n\}$ and has poles exactly at $\{b_n\}$.


16. Suppose that

$$Q_n(z) = \sum_{k=1}^{N_n} c_k^n z^k$$

are given polynomials for $n = 1, 2, \ldots$. Suppose also that we are given a sequence of
complex numbers $\{a_n\}$ without limit points. Prove that there exists a meromorphic
function $f(z)$ whose only poles are at $\{a_n\}$, and so that for each $n$, the difference

$$f(z) - Q_n\!\left( \frac{1}{z - a_n} \right)$$

is holomorphic near $a_n$. In other words, $f$ has prescribed poles and principal
parts at each of these poles. This result is due to Mittag-Leffler.


17. Given two countably infinite sequences of complex numbers $\{a_k\}_{k=0}^{\infty}$ and
$\{b_k\}_{k=0}^{\infty}$, with $\lim_{k \to \infty} |a_k| = \infty$, it is always possible to find an entire function $F$
that satisfies $F(a_k) = b_k$ for all $k$.

(a) Given $n$ distinct complex numbers $a_1, \ldots, a_n$, and another $n$ complex
numbers $b_1, \ldots, b_n$, construct a polynomial $P$ of degree $\le n - 1$ with

$$P(a_i) = b_i \quad\text{for } i = 1, \ldots, n.$$

(b) Let $\{a_k\}_{k=0}^{\infty}$ be a sequence of distinct complex numbers such that $a_0 = 0$
and $\lim_{k \to \infty} |a_k| = \infty$, and let $E(z)$ denote a Weierstrass product associated
with $\{a_k\}$. Given complex numbers $\{b_k\}_{k=0}^{\infty}$, show that there exist integers
$m_k \ge 1$ such that the series

$$F(z) = \frac{b_0}{E'(0)} \frac{E(z)}{z} + \sum_{k=1}^{\infty} \frac{b_k}{E'(a_k)} \frac{E(z)}{z - a_k} \left( \frac{z}{a_k} \right)^{m_k}$$

defines an entire function that satisfies

$$F(a_k) = b_k \quad\text{for all } k \ge 0.$$

This is known as the Pringsheim interpolation formula.


7 Problems 

1. Prove that if $f$ is holomorphic in the unit disc, bounded and not identically
zero, and $z_1, z_2, \ldots, z_n, \ldots$ are its zeros ($|z_k| < 1$), then

$$\sum_{n} (1 - |z_n|) < \infty.$$

[Hint: Use Jensen’s formula.]


2.* In this problem, we discuss Blaschke products, which are bounded analogues
in the disc of the Weierstrass products for entire functions.

(a) Show that for $0 < |\alpha| < 1$ and $|z| \le r < 1$ the inequality

$$\left| \frac{\alpha + |\alpha| z}{(1 - \overline{\alpha} z)\alpha} \right| \le \frac{1 + r}{1 - r}$$

holds.

(b) Let $\{\alpha_n\}$ be a sequence in the unit disc such that $\alpha_n \ne 0$ for all $n$ and

$$\sum_{n=1}^{\infty} (1 - |\alpha_n|) < \infty.$$

Note that this will be the case if $\{\alpha_n\}$ are the zeros of a bounded holomorphic
function on the unit disc (see Problem 1). Show that the product

$$f(z) = \prod_{n=1}^{\infty} \frac{\alpha_n - z}{1 - \overline{\alpha_n} z} \frac{|\alpha_n|}{\alpha_n}$$

converges uniformly for $|z| \le r < 1$, and defines a holomorphic function on
the unit disc having precisely the zeros $\alpha_n$ and no other zeros. Show that
$|f(z)| \le 1$.


3.* Show that $\sum \frac{z^n}{(n!)^{\alpha}}$ is an entire function of order $1/\alpha$.

4.* Let $F(z) = \sum_{n=0}^{\infty} a_n z^n$ be an entire function of finite order. Then the growth
order of $F$ is intimately linked with the growth of the coefficients $a_n$ as $n \to \infty$.
In fact:

(a) Suppose $|F(z)| \le Ae^{a|z|^{\rho}}$. Then

$$\limsup_{n \to \infty} |a_n|^{1/n} n^{1/\rho} < \infty. \tag{8}$$

(b) Conversely, if (8) holds, then $|F(z)| \le A_{\epsilon} e^{a_{\epsilon}|z|^{\rho+\epsilon}}$, for every $\epsilon > 0$.

[Hint: To prove (a), use Cauchy’s inequality

$$|a_n| \le \frac{A}{r^n} e^{ar^{\rho}},$$

and the fact that the function $u^{-n}e^{u^{\rho}}$, $0 < u < \infty$, attains its minimum value
$e^{n/\rho}(\rho/n)^{n/\rho}$ at $u = n^{1/\rho}/\rho^{1/\rho}$. Then, choose $r$ in terms of $n$ to achieve this
minimum.

To establish (b), note that for $|z| = r$,

$$|F(z)| \le \sum \frac{c^n r^n}{n^{n/\rho}} \le \sum \frac{c^n r^n}{(n!)^{1/\rho}}$$

for some constant $c$, since $n^n \ge n!$. This yields a reduction to Problem 3.]





The Gamma and Zeta Functions


It is no exaggeration to say that the gamma and zeta functions are
among the most important nonelementary functions in mathematics.
The gamma function $\Gamma$ is ubiquitous in nature. It arises in a host of
calculations and is featured in a large number of identities that occur in
analysis. Part of the explanation for this probably lies in the basic
structural property of the gamma function, which essentially characterizes
it: $1/\Gamma(s)$ is the (simplest) entire function$^{1}$ which has zeros at exactly
$s = 0, -1, -2, \ldots$.

The zeta function $\zeta$ (whose study, like that of the gamma function,
was initiated by Euler) plays a fundamental role in the analytic theory
of numbers. Its intimate connection with prime numbers comes about
via the identity for $\zeta(s)$:

$$\prod_{p} \frac{1}{1 - p^{-s}} = \sum_{n=1}^{\infty} \frac{1}{n^s},$$

where the product is over all primes. The behavior of $\zeta(s)$ for real $s > 1$,
with $s$ tending to 1, was used by Euler to prove that $\sum_p 1/p$ diverges,
and a similar reasoning for $L$-functions is at the starting point of the
proof of Dirichlet’s theorem on primes in arithmetic progression, as we
saw in Book I.

While there is no difficulty in seeing that $\zeta(s)$ is well-defined (and
analytic) when $\mathrm{Re}(s) > 1$, it was Riemann who realized that the further
study of primes was bound up with the analytic (in fact, meromorphic)
continuation of $\zeta$ into the rest of the complex plane. Beyond this, we also
consider its remarkable functional equation, which reveals a symmetry
about the line $\mathrm{Re}(s) = 1/2$, and whose proof is based on a corresponding
identity for the theta function. We also make a more detailed study of
the growth of $\zeta(s)$ near the line $\mathrm{Re}(s) = 1$, which will be required in the
proof of the prime number theorem given in the next chapter.


1 In keeping with the standard notation of the subject, we denote by $s$ (instead of $z$)
the argument of the functions $\Gamma$ and $\zeta$.

1 The gamma function

For $s > 0$, the gamma function is defined by

$$\Gamma(s) = \int_0^{\infty} e^{-t}t^{s-1}\,dt. \tag{1}$$

The integral converges for each positive $s$ because near $t = 0$ the
function $t^{s-1}$ is integrable, and for $t$ large the convergence is guaranteed by
the exponential decay of the integrand. These observations allow us to
extend the domain of definition of $\Gamma$ as follows.


Proposition 1.1 The gamma function extends to an analytic function
in the half-plane $\mathrm{Re}(s) > 0$, and is still given there by the integral
formula (1).

Proof. It suffices to show that the integral defines a holomorphic
function in every strip

$$S_{\delta,M} = \{\delta < \mathrm{Re}(s) < M\},$$

where $0 < \delta < M < \infty$. Note that if $\sigma$ denotes the real part of $s$, then
$|e^{-t}t^{s-1}| = e^{-t}t^{\sigma-1}$, so that the integral

$$\Gamma(s) = \int_0^{\infty} e^{-t}t^{s-1}\,dt,$$

which is defined by the limit $\lim_{\epsilon \to 0} \int_{\epsilon}^{1/\epsilon} e^{-t}t^{s-1}\,dt$, converges for each
$s \in S_{\delta,M}$. For $\epsilon > 0$, let

$$F_{\epsilon}(s) = \int_{\epsilon}^{1/\epsilon} e^{-t}t^{s-1}\,dt.$$

By Theorem 5.4 in Chapter 2, the function $F_{\epsilon}$ is holomorphic in the
strip $S_{\delta,M}$. By Theorem 5.2, also of Chapter 2, it suffices to show that
$F_{\epsilon}$ converges uniformly to $\Gamma$ on the strip $S_{\delta,M}$. To see this, we first
observe that

$$|\Gamma(s) - F_{\epsilon}(s)| \le \int_0^{\epsilon} e^{-t}t^{\sigma-1}\,dt + \int_{1/\epsilon}^{\infty} e^{-t}t^{\sigma-1}\,dt.$$

The first integral converges uniformly to 0 as $\epsilon$ tends to 0, since it can
be easily estimated by $\epsilon^{\delta}/\delta$ whenever $0 < \epsilon < 1$. The second integral
converges uniformly to 0 as well, since

$$\left| \int_{1/\epsilon}^{\infty} e^{-t}t^{\sigma-1}\,dt \right| \le \int_{1/\epsilon}^{\infty} e^{-t}t^{M-1}\,dt \le C \int_{1/\epsilon}^{\infty} e^{-t/2}\,dt \to 0,$$

and the proof is complete.

1.1 Analytic continuation 

Despite the fact that the integral defining $\Gamma$ is not absolutely convergent
for other values of $s$, we can go further and prove that there exists a
meromorphic function defined on all of $\mathbb{C}$ that equals $\Gamma$ in the half-plane
$\mathrm{Re}(s) > 0$. In the same sense as in Chapter 2, we say that this function
is the analytic continuation$^{2}$ of $\Gamma$, and we therefore continue to denote it
by $\Gamma$.

To prove the asserted analytic extension to a meromorphic function,
we need a lemma, which incidentally exhibits an important property
of $\Gamma$.

Lemma 1.2 If $\mathrm{Re}(s) > 0$, then

$$\Gamma(s + 1) = s\Gamma(s). \tag{2}$$

As a consequence $\Gamma(n + 1) = n!$ for $n = 0, 1, 2, \ldots$.


Proof. Integrating by parts in the finite integrals gives

$$\int_{\epsilon}^{1/\epsilon} \frac{d}{dt}\left(e^{-t}t^{s}\right) dt = -\int_{\epsilon}^{1/\epsilon} e^{-t}t^{s}\,dt + s\int_{\epsilon}^{1/\epsilon} e^{-t}t^{s-1}\,dt,$$

and the desired formula (2) follows by letting $\epsilon$ tend to 0, and noting
that the left-hand side vanishes because $e^{-t}t^{s} \to 0$ as $t$ tends to 0 or $\infty$.
Now it suffices to check that

$$\Gamma(1) = \int_0^{\infty} e^{-t}\,dt = \left[ -e^{-t} \right]_0^{\infty} = 1,$$

and to apply (2) successively to find that $\Gamma(n + 1) = n!$.
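
The integral definition (1) and the functional equation (2) can be checked numerically. The sketch below relies on assumptions not in the text: the substitution $t = e^x$, the truncation interval, the step size, and the test point are all illustrative choices. After the substitution the integrand decays rapidly in both directions, so a plain equally spaced sum is accurate for $\mathrm{Re}(s)$ not too close to 0.

```python
import cmath
import math

def gamma_integral(s, lo=-80.0, hi=6.0, h=0.01):
    """Approximate Gamma(s) = int_0^inf e^{-t} t^{s-1} dt for Re(s) > 0.

    With t = e^x the integral becomes int exp(-e^x + s x) dx over the real line.
    The window [lo, hi] and step h are demo choices, adequate for Re(s) >= 1/2.
    """
    steps = int(round((hi - lo) / h))
    total = 0j
    for j in range(steps + 1):
        x = lo + j * h
        total += cmath.exp(-math.exp(x) + s * x)
    return total * h

s = 1.5 + 2.0j  # illustrative point in the half-plane Re(s) > 0
print(gamma_integral(s + 1), s * gamma_integral(s))
print(gamma_integral(5), math.factorial(4))
```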

Formula (2) in the lemma is all we need to give a proof of the following 
theorem. 

Theorem 1.3 The function $\Gamma(s)$ initially defined for $\mathrm{Re}(s) > 0$ has an
analytic continuation to a meromorphic function on $\mathbb{C}$ whose only
singularities are simple poles at the negative integers $s = 0, -1, \ldots$. The
residue of $\Gamma$ at $s = -n$ is $(-1)^n/n!$.


2 Uniqueness of the analytic continuation is guaranteed since the complement of the
poles of a meromorphic function forms a connected set.

Proof. It suffices to extend $\Gamma$ to each half-plane $\mathrm{Re}(s) > -m$, where
$m \ge 1$ is an integer. For $\mathrm{Re}(s) > -1$, we define

$$F_1(s) = \frac{\Gamma(s + 1)}{s}.$$

Since $\Gamma(s + 1)$ is holomorphic in $\mathrm{Re}(s) > -1$, we see that $F_1$ is
meromorphic in that half-plane, with the only possible singularity a simple pole
at $s = 0$. The fact that $\Gamma(1) = 1$ shows that $F_1$ does in fact have a simple
pole at $s = 0$ with residue 1. Moreover, if $\mathrm{Re}(s) > 0$, then

$$F_1(s) = \frac{\Gamma(s + 1)}{s} = \Gamma(s)$$

by the previous lemma. So $F_1$ extends $\Gamma$ to a meromorphic function on
the half-plane $\mathrm{Re}(s) > -1$. We can now continue in this fashion by
defining a meromorphic $F_m$ for $\mathrm{Re}(s) > -m$ that agrees with $\Gamma$ on $\mathrm{Re}(s) > 0$.
For $\mathrm{Re}(s) > -m$, where $m$ is an integer $\ge 1$, define

$$F_m(s) = \frac{\Gamma(s + m)}{(s + m - 1)(s + m - 2) \cdots s}.$$

The function $F_m$ is meromorphic in $\mathrm{Re}(s) > -m$ and has simple poles
at $s = 0, -1, -2, \ldots, -m + 1$ with residues

$$\mathrm{res}_{s=-n} F_m(s) = \frac{\Gamma(-n + m)}{(m - 1 - n)!\,(-1)(-2) \cdots (-n)} = \frac{(m - n - 1)!}{(m - 1 - n)!\,(-1)(-2) \cdots (-n)} = \frac{(-1)^n}{n!}.$$

Successive applications of the lemma show that $F_m(s) = \Gamma(s)$ for
$\mathrm{Re}(s) > 0$. By uniqueness, this also means that $F_m = F_k$ for $1 \le k \le m$ on the
domain of definition of $F_k$. Therefore, we have obtained the desired
continuation of $\Gamma$.
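
The continuation $F_m$ is easy to experiment with for real arguments. In the sketch below the function names and the finite-difference parameter `eps` are assumptions made for the illustration. It evaluates $F_m$ using the standard library’s `math.gamma` only at positive arguments, compares the result with `math.gamma` at a negative non-integer point, and estimates the residues at $s = -n$ by $\epsilon F_m(-n + \epsilon)$.

```python
import math

def gamma_via_shift(s, m):
    """F_m(s) = Gamma(s + m) / ((s + m - 1)(s + m - 2) ... s), real s > -m, s not a pole."""
    denom = 1.0
    for j in range(m):
        denom *= s + j
    return math.gamma(s + m) / denom

def residue_estimate(n, m, eps=1e-6):
    """Estimate the residue at s = -n by eps * F_m(-n + eps), for 0 <= n < m."""
    return eps * gamma_via_shift(-n + eps, m)

print(gamma_via_shift(-2.5, 3), math.gamma(-2.5))
for n in range(4):
    print(n, residue_estimate(n, 5), (-1) ** n / math.factorial(n))
```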

Remark. We have already proved that $\Gamma(s + 1) = s\Gamma(s)$ whenever
$\mathrm{Re}(s) > 0$. In fact, by analytic continuation, this formula remains true
whenever $s \ne 0, -1, -2, \ldots$, that is, whenever $s$ is not a pole of $\Gamma$. This
is because both sides of the formula are holomorphic in the complement
of the poles of $\Gamma$ and are equal when $\mathrm{Re}(s) > 0$. Actually, one can go
further, and note that if $s$ is a negative integer $s = -n$ with $n \ge 1$, then
both sides of the formula are infinite and moreover

$$\mathrm{res}_{s=-n} \Gamma(s + 1) = -n\,\mathrm{res}_{s=-n} \Gamma(s).$$

Finally, note that when $s = 0$ we have $\Gamma(1) = \lim_{s \to 0} s\Gamma(s)$.

An alternate proof of Theorem 1.3, which is interesting in its own right
and whose ideas recur later, is obtained by splitting the integral for $\Gamma(s)$
defined on $\mathrm{Re}(s) > 0$ as follows:

$$\Gamma(s) = \int_0^{1} e^{-t}t^{s-1}\,dt + \int_1^{\infty} e^{-t}t^{s-1}\,dt.$$

The integral on the far right defines an entire function; also expanding
$e^{-t}$ in a power series and integrating term by term gives

$$\int_0^{1} e^{-t}t^{s-1}\,dt = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!(n + s)}.$$

Therefore

$$\Gamma(s) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!(n + s)} + \int_1^{\infty} e^{-t}t^{s-1}\,dt \quad\text{for } \mathrm{Re}(s) > 0. \tag{3}$$

Finally, the series defines a meromorphic function on $\mathbb{C}$ with poles at
the negative integers and residue $(-1)^n/n!$ at $s = -n$. To prove this, we
argue as follows. For a fixed $R > 0$ we may split the sum in two parts

$$\sum_{n=0}^{\infty} \frac{(-1)^n}{n!(n + s)} = \sum_{n=0}^{N} \frac{(-1)^n}{n!(n + s)} + \sum_{n=N+1}^{\infty} \frac{(-1)^n}{n!(n + s)},$$

where $N$ is an integer chosen so that $N > 2R$. The first sum, which is
finite, defines a meromorphic function in the disc $|s| < R$ with poles at
the desired points and the correct residues. The second sum converges
uniformly in that disc, hence defines a holomorphic function there, since
$n > N > 2R$ and $|n + s| \ge R$ imply

$$\left| \frac{(-1)^n}{n!(n + s)} \right| \le \frac{1}{n!R}.$$

Since $R$ was arbitrary, we conclude that the series in (3) has the desired
properties.

In particular, the relation (3) now holds on all of $\mathbb{C}$.
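
Relation (3) gives a practical way to evaluate $\Gamma$ to the left of the imaginary axis. The sketch below is restricted to real non-pole $s$, and its quadrature rule, truncation point `hi`, and number of series terms are assumptions chosen for illustration. It sums the series, adds a Simpson approximation of the integral over $[1, \infty)$, and compares the result with `math.gamma`.

```python
import math

def tail_integral(s, hi=60.0, steps=60_000):
    """Composite Simpson approximation of int_1^hi e^{-t} t^{s-1} dt for real s.

    steps must be even; the part of the integral beyond hi is negligible here.
    """
    h = (hi - 1.0) / steps
    f = lambda t: math.exp(-t) * t ** (s - 1)
    total = f(1.0) + f(hi)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * f(1.0 + j * h)
    return total * h / 3

def gamma_by_formula_3(s, terms=40):
    """Series sum_{n>=0} (-1)^n / (n! (n+s)) plus the integral over [1, inf)."""
    series = sum((-1) ** n / (math.factorial(n) * (n + s)) for n in range(terms))
    return series + tail_integral(s)

for s in (0.5, 3.0, -0.5, -2.5):
    print(s, gamma_by_formula_3(s), math.gamma(s))
```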


1.2 Further properties of T 

The following identity reveals the symmetry of T about the line Re(s)= 

1 / 2 . 
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Theorem 1.4 For all s G C 7 


⑷ 


r ⑷ r(i- s ) 


Sm7TS 


Observe that r(l — s) has simple poles at the positive integers s = 
1,2, 3,…， so that r(5)r(l — 5) is a meromorphic function on C with 
simple poles at all the integers, a property also shared by 7r/ siutts. 

To prove the identity, it suffices to do so for 0 < 5 < 1 since it then 
holds on all of C by analytic continuation. 

/ »oo a—1 

I V 7T 

Lemma 1.5 For 0 < a < 1 ， / - - dv 


lo 1 + u 


sm 7ra 


Proof. We observe first that 


i-i 


1 -\-v 


• dv 


l-he x 


dx. 


which follows by making the change of variables v = e x . However, using 
contour integration, we saw in Example 2 of Section 2.1 in Chapter 3, 
that the second integral equals 7r/sin7ra, as desired. 

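To establish the theorem, we first note that for $0 < s < 1$ we may write

$$\Gamma(1-s) = \int_0^\infty e^{-u} u^{-s}\,du = t \int_0^\infty e^{-vt} (vt)^{-s}\,dv,$$

where for $t > 0$ we made the change of variables $vt = u$. This trick then
gives

$$\begin{aligned}
\Gamma(1-s)\Gamma(s) &= \int_0^\infty e^{-t} t^{s-1}\,\Gamma(1-s)\,dt \\
&= \int_0^\infty e^{-t} t^{s-1} \left( t \int_0^\infty e^{-vt} (vt)^{-s}\,dv \right) dt \\
&= \int_0^\infty \int_0^\infty e^{-t[1+v]} v^{-s}\,dv\,dt \\
&= \int_0^\infty \frac{v^{-s}}{1+v}\,dv \\
&= \frac{\pi}{\sin \pi(1-s)} \\
&= \frac{\pi}{\sin \pi s},
\end{aligned}$$

and the theorem is proved.

In particular, by putting $s = 1/2$, and noting that $\Gamma(s) > 0$ whenever
$s > 0$, we find that

$$\Gamma(1/2) = \sqrt{\pi}.$$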
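
A quick numerical sanity check of identity (4), including the special case $s = 1/2$, can be written with the standard library. This is only a sketch for real sample points; the helper name `reflection_gap` and the sample list are assumptions made here.

```python
import math

def reflection_gap(s):
    """Return |Gamma(s) Gamma(1 - s) - pi / sin(pi s)| for real non-integer s."""
    lhs = math.gamma(s) * math.gamma(1.0 - s)
    rhs = math.pi / math.sin(math.pi * s)
    return abs(lhs - rhs)

# Sample points inside and outside the interval 0 < s < 1.
samples = [0.1, 0.25, 0.5, 0.9, -1.5, 2.75]
gaps = [reflection_gap(s) for s in samples]
print(max(gaps))

# Special case s = 1/2: Gamma(1/2)^2 should equal pi.
print(math.gamma(0.5) ** 2, math.pi)
```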


We continue our study of the gamma function by considering its reciprocal,
which turns out to be an entire function with remarkably simple
properties.

Theorem 1.6 The function $\Gamma$ has the following properties:

(i) $1/\Gamma(s)$ is an entire function of $s$ with simple zeros at
$s = 0, -1, -2, \ldots$ and it vanishes nowhere else.

(ii) $1/\Gamma(s)$ has growth

$$\left| \frac{1}{\Gamma(s)} \right| \leq c_1 e^{c_2 |s| \log |s|}.$$

Therefore, $1/\Gamma$ is of order 1 in the sense that for every $\epsilon > 0$, there
exists a bound $c(\epsilon)$ so that

$$\left| \frac{1}{\Gamma(s)} \right| \leq c(\epsilon)\, e^{c_2 |s|^{1+\epsilon}}.$$


Proof. By Theorem 1.4 we may write

$$\frac{1}{\Gamma(s)} = \Gamma(1-s)\,\frac{\sin \pi s}{\pi}, \tag{5}$$

so the simple poles of $\Gamma(1-s)$, which are at $s = 1, 2, 3, \ldots$, are cancelled
by the simple zeros of $\sin \pi s$, and therefore $1/\Gamma$ is entire with simple zeros
at $s = 0, -1, -2, -3, \ldots$.

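To prove the estimate, we begin by showing that

$$\int_1^\infty e^{-t} t^{\sigma}\,dt \leq e^{(\sigma+1)\log(\sigma+1)}$$

whenever $\sigma = \mathrm{Re}(s)$ is positive. Choose $n$ so that $\sigma \leq n \leq \sigma + 1$. Then

$$\begin{aligned}
\int_1^\infty e^{-t} t^{\sigma}\,dt &\leq \int_0^\infty e^{-t} t^{n}\,dt \\
&= n! \\
&\leq n^n \\
&= e^{n \log n} \\
&\leq e^{(\sigma+1)\log(\sigma+1)}.
\end{aligned}$$

Since the relation (3) holds on all of $\mathbb{C}$, we see from (5) that

$$\frac{1}{\Gamma(s)} = \left( \sum_{n=0}^\infty \frac{(-1)^n}{n!\,(n+1-s)} \right) \frac{\sin \pi s}{\pi} + \left( \int_1^\infty e^{-t} t^{-s}\,dt \right) \frac{\sin \pi s}{\pi}.$$

However, from our previous observation,

$$\left| \int_1^\infty e^{-t} t^{-s}\,dt \right| \leq \int_1^\infty e^{-t} t^{|\sigma|}\,dt \leq e^{(|\sigma|+1)\log(|\sigma|+1)},$$

and because $|\sin \pi s| \leq e^{\pi |s|}$ (by Euler's formula for the sine function)
we find that the second term in the formula for $1/\Gamma(s)$ is dominated by
$c\,e^{(|s|+1)\log(|s|+1)} e^{\pi |s|}$, which is itself majorized by
$c_1 e^{c_2 |s| \log |s|}$. Next, we consider the term

$$\sum_{n=0}^\infty \frac{(-1)^n}{n!\,(n+1-s)}\,\frac{\sin \pi s}{\pi}.$$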


There are two cases: $|\mathrm{Im}(s)| > 1$ and $|\mathrm{Im}(s)| \leq 1$. In the first case, this
expression is dominated in absolute value by $c\,e^{\pi |s|}$. If $|\mathrm{Im}(s)| \leq 1$, we
choose $k$ to be the integer so that $k - 1/2 \leq \mathrm{Re}(s) < k + 1/2$. Then if
$k \geq 1$,

$$\sum_{n=0}^\infty \frac{(-1)^n}{n!\,(n+1-s)}\,\frac{\sin \pi s}{\pi} = (-1)^{k-1} \frac{\sin \pi s}{(k-1)!\,(k-s)\,\pi} + \sum_{n \neq k-1} (-1)^n \frac{\sin \pi s}{n!\,(n+1-s)\,\pi}.$$

Both terms on the right are bounded; the first because $\sin \pi s$ vanishes
at $s = k$, and the second because the sum is majorized by $c \sum 1/n!$.

When $k \leq 0$, then $\mathrm{Re}(s) < 1/2$ by our supposition, and
$\sum_{n=0}^\infty \frac{(-1)^n}{n!\,(n+1-s)}$ is again bounded by $c \sum 1/n!$. This
concludes the proof of the theorem.

The fact that $1/\Gamma$ satisfies the type of growth conditions discussed in
Chapter 5 leads naturally to the product formula for the function $1/\Gamma$,
which we treat next.


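Theorem 1.7 For all $s \in \mathbb{C}$,

$$\frac{1}{\Gamma(s)} = e^{\gamma s}\, s \prod_{n=1}^\infty \left( 1 + \frac{s}{n} \right) e^{-s/n}.$$

The real number $\gamma$, which is known as Euler's constant, is defined
by

$$\gamma = \lim_{N \to \infty} \sum_{n=1}^{N} \frac{1}{n} - \log N.$$

The existence of the limit was already proved in Proposition 3.10, Chapter 8
of Book I, but we shall repeat the argument here for completeness.
Observe that

$$\sum_{n=1}^{N} \frac{1}{n} - \log N = \sum_{n=1}^{N} \frac{1}{n} - \int_1^N \frac{dx}{x} = \sum_{n=1}^{N-1} \int_n^{n+1} \left[ \frac{1}{n} - \frac{1}{x} \right] dx + \frac{1}{N},$$

and by the mean value theorem applied to $f(x) = 1/x$ we have

$$\left| \frac{1}{n} - \frac{1}{x} \right| \leq \frac{1}{n^2} \quad \text{for all } n \leq x \leq n+1.$$

Hence

$$\sum_{n=1}^{N} \frac{1}{n} - \log N = \sum_{n=1}^{N-1} a_n + \frac{1}{N},$$

where $|a_n| \leq 1/n^2$. Therefore $\sum a_n$ converges, which proves that the
limit defining $\gamma$ exists. We may now proceed with the proof of the
factorization of $1/\Gamma$.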
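
The convergence of $\sum_{n \le N} 1/n - \log N$ can be observed numerically. The sketch below is only an illustration; the function name `euler_constant_estimate` is an assumption of this example.

```python
import math

def euler_constant_estimate(N):
    """Return sum_{n=1}^{N} 1/n - log N, which tends to Euler's constant as N grows."""
    return math.fsum(1.0 / n for n in range(1, N + 1)) - math.log(N)

for N in (10, 100, 1000, 100_000):
    print(N, euler_constant_estimate(N))
```

The printed values decrease slowly toward the limit, consistent with the bound $|a_n| \leq 1/n^2$ used in the argument.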

Proof. By the Hadamard factorization theorem and the fact that $1/\Gamma$
is entire, of growth order 1, and has simple zeros at $s = 0, -1, -2, \ldots$, we
can expand $1/\Gamma$ in a Weierstrass product of the form

$$\frac{1}{\Gamma(s)} = e^{As+B}\, s \prod_{n=1}^\infty \left( 1 + \frac{s}{n} \right) e^{-s/n}.$$

Here $A$ and $B$ are two constants that are to be determined. Remembering
that $s\Gamma(s) \to 1$ as $s \to 0$, we find that $B = 0$ (or some integer multiple
of $2\pi i$, which of course gives the same result). Putting $s = 1$, and using
the fact that $\Gamma(1) = 1$ yields

$$\begin{aligned}
e^{-A} &= \prod_{n=1}^\infty \left( 1 + \frac{1}{n} \right) e^{-1/n} \\
&= \lim_{N \to \infty} \prod_{n=1}^{N} \left( 1 + \frac{1}{n} \right) e^{-1/n} \\
&= \lim_{N \to \infty} e^{\sum_{n=1}^{N} [\log(1 + 1/n) - 1/n]} \\
&= \lim_{N \to \infty} e^{-\left( \sum_{n=1}^{N} 1/n \right) + \log N + \log(1 + 1/N)} \\
&= e^{-\gamma}.
\end{aligned}$$

Therefore $A = \gamma + 2\pi i k$ for some integer $k$. Since $\Gamma(s)$ is real whenever
$s$ is real, we must have $k = 0$, and the argument is complete.
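
The product formula can be checked numerically by truncating the infinite product. The following is a minimal sketch for real $s > -1$ (so that every factor $1 + s/n$ is positive and logarithms can be used); the constant `EULER_GAMMA` is hard-coded and the name `reciprocal_gamma_product` is an assumption of this example.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this sketch

def reciprocal_gamma_product(s, N=200_000):
    """Approximate 1/Gamma(s) by truncating the product of Theorem 1.7 at n = N.

    Assumes s is real with s > -1, so that log(1 + s/n) is defined for n >= 1.
    """
    # Sum the logarithms of the factors (1 + s/n) e^{-s/n} for numerical stability.
    log_prod = math.fsum(math.log1p(s / n) - s / n for n in range(1, N + 1))
    return math.exp(EULER_GAMMA * s + log_prod) * s

for s in (0.5, 2.5, -0.5):
    print(s, reciprocal_gamma_product(s), 1.0 / math.gamma(s))
```

The truncation error decays only like $s^2/(2N)$, so the agreement is to a few digits rather than to machine precision.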

Note that the proof shows that the function $1/\Gamma$ is essentially
characterized (up to two normalizing constants) as the entire function that
has:

(i) simple zeros at $s = 0, -1, -2, \ldots$ and vanishes nowhere else, and

(ii) order of growth $\leq 1$.

Observe that $\sin \pi s$ has a similar characterization (except the zeros are
now at all the integers). However, while $\sin \pi s$ has a stricter growth
estimate of the form $\sin \pi s = O(e^{c|s|})$, this estimate (without the logarithm
in the exponent) does not hold for $1/\Gamma(s)$ as Exercise 12 demonstrates.

2 The zeta function 

The Riemann zeta function is initially defined for real $s > 1$ by the
convergent series

$$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}.$$

As in the case of the gamma function, $\zeta$ can be continued into the
complex plane. There are several proofs of this fact, and we present in the
next section the one that relies on the functional equation of $\zeta$.


2.1 Functional equation and analytic continuation

In parallel to the gamma function, we first provide a simple extension of
$\zeta$ to a half-plane in $\mathbb{C}$.

Proposition 2.1 The series defining $\zeta(s)$ converges for $\mathrm{Re}(s) > 1$, and
the function $\zeta$ is holomorphic in this half-plane.

Proof. If $s = \sigma + it$ where $\sigma$ and $t$ are real, then

$$|n^{-s}| = |e^{-s \log n}| = e^{-\sigma \log n} = n^{-\sigma}.$$

As a consequence, if $\sigma > 1 + \delta > 1$ the series defining $\zeta$ is uniformly
bounded by $\sum_{n=1}^\infty 1/n^{1+\delta}$, which converges. Therefore, the series
$\sum 1/n^s$ converges uniformly on every half-plane $\mathrm{Re}(s) > 1 + \delta > 1$, and
therefore defines a holomorphic function in $\mathrm{Re}(s) > 1$.

The analytic continuation of $\zeta$ to a meromorphic function in $\mathbb{C}$ is more
subtle than in the case of the gamma function. The proof we present
here relates $\zeta$ to $\Gamma$ and another important function.

Consider the theta function, already introduced in Chapter 4, which
is defined for real $t > 0$ by

$$\vartheta(t) = \sum_{n=-\infty}^\infty e^{-\pi n^2 t}.$$

An application of the Poisson summation formula (Theorem 2.4 in Chapter 4)
gave the functional equation satisfied by $\vartheta$, namely

$$\vartheta(t) = t^{-1/2}\,\vartheta(1/t).$$

The growth and decay of $\vartheta$ we shall need are

$$\vartheta(t) \leq C t^{-1/2} \quad \text{as } t \to 0,$$

and

$$|\vartheta(t) - 1| \leq C e^{-\pi t} \quad \text{for some } C > 0, \text{ and all } t \geq 1.$$

The inequality for $t$ tending to zero follows from the functional equation,
while the behavior as $t$ tends to infinity follows from the fact that

$$\sum_{n \geq 1} e^{-\pi n^2 t} \leq \sum_{n \geq 1} e^{-\pi n t} \leq C e^{-\pi t}$$

for $t \geq 1$.
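
The functional equation and the decay estimate for the theta function are easy to test numerically. The sketch below truncates the defining series; the function name `theta` and the truncation length are choices made for this illustration.

```python
import math

def theta(t, terms=50):
    """theta(t) = sum over all integers n of exp(-pi n^2 t), for real t > 0.

    The series is symmetric in n, so it equals 1 + 2 * sum_{n>=1} exp(-pi n^2 t).
    """
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * t) for n in range(1, terms + 1))

# Functional equation: theta(t) = t^{-1/2} theta(1/t).
for t in (0.1, 0.5, 1.0, 3.0):
    print(t, theta(t), theta(1.0 / t) / math.sqrt(t))

# Decay: theta(t) - 1 is comparable to e^{-pi t} for t >= 1.
print(theta(2.0) - 1.0, math.exp(-2.0 * math.pi))
```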

We are now in a position to prove an important relation among $\zeta$, $\Gamma$
and $\vartheta$.

Theorem 2.2 If $\mathrm{Re}(s) > 1$, then

$$\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) = \frac{1}{2} \int_0^\infty u^{(s/2)-1} [\vartheta(u) - 1]\,du.$$

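Proof. This and further arguments are based on the observation that

$$\int_0^\infty e^{-\pi n^2 u}\, u^{(s/2)-1}\,du = \pi^{-s/2}\,\Gamma(s/2)\,n^{-s}, \quad \text{if } n \geq 1. \tag{6}$$

Indeed, if we make the change of variables $u = t/\pi n^2$ in the integral, the
left-hand side becomes

$$\left( \int_0^\infty e^{-t} t^{(s/2)-1}\,dt \right) (\pi n^2)^{-s/2},$$

which is precisely $\pi^{-s/2}\,\Gamma(s/2)\,n^{-s}$. Next, note that

$$\frac{\vartheta(u) - 1}{2} = \sum_{n=1}^\infty e^{-\pi n^2 u}.$$

The estimates for $\vartheta$ given before the statement of the theorem justify an
interchange of the infinite sum with the integral, and thus

$$\begin{aligned}
\frac{1}{2} \int_0^\infty u^{(s/2)-1} [\vartheta(u) - 1]\,du &= \sum_{n=1}^\infty \int_0^\infty u^{(s/2)-1} e^{-\pi n^2 u}\,du \\
&= \pi^{-s/2}\,\Gamma(s/2) \sum_{n=1}^\infty n^{-s} \\
&= \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s),
\end{aligned}$$

as was to be shown.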

In view of this, we now consider the modification of the $\zeta$ function
called the xi function, which makes the former appear more symmetric.
It is defined for $\mathrm{Re}(s) > 1$ by

$$\xi(s) = \pi^{-s/2}\,\Gamma(s/2)\,\zeta(s). \tag{7}$$

Theorem 2.3 The function $\xi$ is holomorphic for $\mathrm{Re}(s) > 1$ and has an
analytic continuation to all of $\mathbb{C}$ as a meromorphic function with simple
poles at $s = 0$ and $s = 1$. Moreover,

$$\xi(s) = \xi(1 - s) \quad \text{for all } s \in \mathbb{C}.$$

Proof. The idea of the proof is to use the functional equation for $\vartheta$,
namely

$$\sum_{n=-\infty}^\infty e^{-\pi n^2 u} = u^{-1/2} \sum_{n=-\infty}^\infty e^{-\pi n^2 / u}, \quad u > 0.$$

We then could multiply both sides by $u^{(s/2)-1}$ and try to integrate in
$u$. Disregarding the terms corresponding to $n = 0$ (which produce
infinities in both sums), we would get the desired equality once we invoked
formula (6), and the parallel formula obtained by making the change of
variables $u \mapsto 1/u$. The actual proof requires a little more work and goes
as follows.

Let $\psi(u) = [\vartheta(u) - 1]/2$. The functional equation for the theta
function, namely $\vartheta(u) = u^{-1/2}\vartheta(1/u)$, implies

$$\psi(u) = u^{-1/2}\,\psi(1/u) + \frac{1}{2u^{1/2}} - \frac{1}{2}.$$

Now, by Theorem 2.2 for $\mathrm{Re}(s) > 1$, we have

$$\begin{aligned}
\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) &= \int_0^\infty u^{(s/2)-1} \psi(u)\,du \\
&= \int_0^1 u^{(s/2)-1} \psi(u)\,du + \int_1^\infty u^{(s/2)-1} \psi(u)\,du \\
&= \int_0^1 u^{(s/2)-1} \left[ u^{-1/2} \psi(1/u) + \frac{1}{2u^{1/2}} - \frac{1}{2} \right] du + \int_1^\infty u^{(s/2)-1} \psi(u)\,du \\
&= \frac{1}{s-1} - \frac{1}{s} + \int_1^\infty \left( u^{(-s/2)-1/2} + u^{(s/2)-1} \right) \psi(u)\,du
\end{aligned}$$

whenever $\mathrm{Re}(s) > 1$. Therefore

$$\xi(s) = \frac{1}{s-1} - \frac{1}{s} + \int_1^\infty \left( u^{(-s/2)-1/2} + u^{(s/2)-1} \right) \psi(u)\,du.$$

Since the function $\psi$ has exponential decay at infinity, the integral above
defines an entire function, and we conclude that $\xi$ has an analytic
continuation to all of $\mathbb{C}$ with simple poles at $s = 0$ and $s = 1$. Moreover, it is
immediate that the integral remains unchanged if we replace $s$ by $1 - s$,
and the same is true for the sum of the two terms $1/(s-1) - 1/s$. We
conclude that $\xi(s) = \xi(1 - s)$ as was to be shown.


From the identity we have proved for $\xi$ we obtain the desired result for
the zeta function: its analytic continuation and its functional equation.

Theorem 2.4 The zeta function has a meromorphic continuation into
the entire complex plane, whose only singularity is a simple pole at $s = 1$.

Proof. A look at (7) provides the meromorphic continuation of $\zeta$,
namely

$$\zeta(s) = \pi^{s/2}\,\frac{\xi(s)}{\Gamma(s/2)}.$$

Recall that $1/\Gamma(s/2)$ is entire with simple zeros at $0, -2, -4, \ldots$, so the
simple pole of $\xi(s)$ at the origin is cancelled by the corresponding zero
of $1/\Gamma(s/2)$. As a consequence, the only singularity of $\zeta$ is a simple pole
at $s = 1$.

We shall now present a more elementary approach to the analytic
continuation of the zeta function, which easily leads to its extension in
the half-plane $\mathrm{Re}(s) > 0$. This method will be useful in studying the
growth properties of $\zeta$ near the line $\mathrm{Re}(s) = 1$ (which will be needed in
the next chapter). The idea behind it is to compare the sum $\sum_{n=1}^\infty n^{-s}$
with the integral $\int_1^\infty x^{-s}\,dx$.

Proposition 2.5 There is a sequence of entire functions $\{\delta_n(s)\}_{n=1}^\infty$
that satisfy the estimate $|\delta_n(s)| \leq |s|/n^{\sigma+1}$, where $s = \sigma + it$, and such
that

$$\sum_{1 \leq n < N} \frac{1}{n^s} - \int_1^N \frac{dx}{x^s} = \sum_{1 \leq n < N} \delta_n(s), \tag{8}$$

whenever $N$ is an integer $> 1$.

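This proposition has the following consequence.

Corollary 2.6 For $\mathrm{Re}(s) > 0$ we have

$$\zeta(s) - \frac{1}{s-1} = H(s),$$

where $H(s) = \sum_{n=1}^\infty \delta_n(s)$ is holomorphic in the half-plane $\mathrm{Re}(s) > 0$.

To prove the proposition we compare $\sum_{1 \leq n < N} n^{-s}$ with
$\sum_{1 \leq n < N} \int_n^{n+1} x^{-s}\,dx$, and set

$$\delta_n(s) = \int_n^{n+1} \left[ \frac{1}{n^s} - \frac{1}{x^s} \right] dx. \tag{9}$$

The mean-value theorem applied to $f(x) = x^{-s}$ yields

$$\left| \frac{1}{n^s} - \frac{1}{x^s} \right| \leq \frac{|s|}{n^{\sigma+1}}, \quad \text{whenever } n \leq x \leq n+1.$$

Therefore $|\delta_n(s)| \leq |s|/n^{\sigma+1}$, and since

$$\int_1^N \frac{dx}{x^s} = \sum_{1 \leq n < N} \int_n^{n+1} \frac{dx}{x^s},$$

the proposition is proved.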

Turning to the corollary, we assume first that $\mathrm{Re}(s) > 1$. We let $N$
tend to infinity in formula (8) of the proposition, and observe that by the
estimate $|\delta_n(s)| \leq |s|/n^{\sigma+1}$ we have the uniform convergence of the
series $\sum \delta_n(s)$ (in any half-plane $\mathrm{Re}(s) \geq \delta$ when $\delta > 0$). Since $\mathrm{Re}(s) > 1$,
the series $\sum n^{-s}$ converges to $\zeta(s)$, and this proves the assertion when
$\mathrm{Re}(s) > 1$. The uniform convergence also shows that $\sum \delta_n(s)$ is
holomorphic when $\mathrm{Re}(s) > 0$, and thus shows that $\zeta(s)$ is extendable to that
half-plane, and that the identity continues to hold there.
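
The functions $\delta_n(s)$ have a closed form, since $\int_n^{n+1} x^{-s}\,dx = \big((n+1)^{1-s} - n^{1-s}\big)/(1-s)$ for $s \neq 1$. The sketch below uses this to evaluate the truncated sum from Corollary 2.6 and to spot-check the bound $|\delta_n(s)| \leq |s|/n^{\sigma+1}$. The names `delta` and `zeta_via_deltas` and the truncation level are assumptions of this example; convergence is slow when $\mathrm{Re}(s)$ is small.

```python
import math

def delta(n, s):
    """delta_n(s) = int_n^{n+1} (n^{-s} - x^{-s}) dx, in closed form (requires s != 1)."""
    return n ** (-s) - ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s)

def zeta_via_deltas(s, N=20_000):
    """Truncate zeta(s) = 1/(s-1) + sum_{n>=1} delta_n(s) after N terms.

    Intended for Re(s) > 0 and s != 1; s may be a float or a complex number.
    """
    return 1 / (s - 1) + sum(delta(n, s) for n in range(1, N + 1))

# Comparison with the classical value zeta(2) = pi^2 / 6.
print(zeta_via_deltas(2.0), math.pi ** 2 / 6)

# Spot-check of the estimate |delta_n(s)| <= |s| / n^{sigma+1} at a complex point.
s = 0.5 + 3j
print(all(abs(delta(n, s)) <= abs(s) / n ** (s.real + 1) for n in range(1, 51)))
```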

Remark. The idea described above can be developed step by step to
yield the continuation of $\zeta$ into the entire complex plane, as shown in
Problems 2 and 3. Another argument giving the full analytic continuation
of $\zeta$ is outlined in Exercises 15 and 16.

As an application of the proposition we can show that the growth
of $\zeta(s)$ near the line $\mathrm{Re}(s) = 1$ is "mild." Recall that when $\mathrm{Re}(s) > 1$,
we have $|\zeta(s)| \leq \sum_{n=1}^\infty n^{-\sigma}$, and so $\zeta(s)$ is bounded in any half-plane
$\mathrm{Re}(s) \geq 1 + \delta$, with $\delta > 0$. We shall see that on the line $\mathrm{Re}(s) = 1$, $|\zeta(s)|$
is majorized by $|t|^{\epsilon}$, for every $\epsilon > 0$, and that the growth near the line is
not much worse. The estimates below are not optimal. In fact, they are
rather crude but suffice for what is needed later on.

Proposition 2.7 Suppose $s = \sigma + it$ with $\sigma, t \in \mathbb{R}$. Then for each $\sigma_0$,
$0 \leq \sigma_0 \leq 1$, and every $\epsilon > 0$, there exists a constant $c_\epsilon$ so that

(i) $|\zeta(s)| \leq c_\epsilon |t|^{1 - \sigma_0 + \epsilon}$, if $\sigma_0 \leq \sigma$ and $|t| \geq 1$.

(ii) $|\zeta'(s)| \leq c_\epsilon |t|^{\epsilon}$, if $1 \leq \sigma$ and $|t| \geq 1$.

In particular, the proposition implies that $\zeta(1 + it) = O(|t|^{\epsilon})$ as $|t|$
tends to infinity,³ and the same estimate also holds for $\zeta'$. For the proof,
we use Corollary 2.6. Recall the estimate $|\delta_n(s)| \leq |s|/n^{\sigma+1}$. We also
have the estimate $|\delta_n(s)| \leq 2/n^{\sigma}$, which follows from the expression for
$\delta_n(s)$ given by (9) and the fact that $|n^{-s}| = n^{-\sigma}$ and $|x^{-s}| \leq n^{-\sigma}$ if
$x \geq n$. We then combine these two estimates for $|\delta_n(s)|$ via the
observation that $A = A^{\delta} A^{1-\delta}$, to obtain the bound

$$|\delta_n(s)| \leq \left( \frac{|s|}{n^{\sigma_0+1}} \right)^{\delta} \left( \frac{2}{n^{\sigma_0}} \right)^{1-\delta} \leq \frac{2|s|^{\delta}}{n^{\sigma_0+\delta}},$$

as long as $\delta \geq 0$. Now choose $\delta = 1 - \sigma_0 + \epsilon$ and apply the identity in
Corollary 2.6. Then, with $\sigma = \mathrm{Re}(s) \geq \sigma_0$, we find

$$|\zeta(s)| \leq \left| \frac{1}{s-1} \right| + 2|s|^{1 - \sigma_0 + \epsilon} \sum_{n=1}^\infty \frac{1}{n^{1+\epsilon}},$$

and conclusion (i) is proved. The second conclusion is actually a
consequence of the first by a slight modification of Exercise 8 in Chapter 2. For
completeness we sketch the argument. By the Cauchy integral formula,

$$\zeta'(s) = \frac{1}{2\pi r} \int_0^{2\pi} \zeta(s + re^{i\theta})\,e^{-i\theta}\,d\theta,$$

where the integration is taken over a circle of radius $r$ centered at the
point $s$. Now choose $r = \epsilon$ and observe that this circle lies in the
half-plane $\mathrm{Re}(s) \geq 1 - \epsilon$, and so (ii) follows as a consequence of (i) on
replacing $2\epsilon$ by $\epsilon$.


3 Exercises 


1. Prove that

$$\Gamma(s) = \lim_{n \to \infty} \frac{n^s\, n!}{s(s+1) \cdots (s+n)}$$

whenever $s \neq 0, -1, -2, \ldots$.

[Hint: Use the product formula for $1/\Gamma$, and the definition of the Euler constant $\gamma$.]


2. Prove that

$$\prod_{n=1}^\infty \frac{n(n+a+b)}{(n+a)(n+b)} = \frac{\Gamma(a+1)\Gamma(b+1)}{\Gamma(a+b+1)}$$

whenever $a$ and $b$ are positive. Using the product formula for $\sin \pi s$, give another
proof that $\Gamma(s)\Gamma(1-s) = \pi/\sin \pi s$.

³ The reader should recall the $O$ notation which was introduced at the end of Chapter 1.


3. Show that Wallis's product formula can be written as

$$\sqrt{\frac{\pi}{2}} = \lim_{n \to \infty} \frac{2^{2n} (n!)^2}{(2n+1)!}\,(2n+1)^{1/2}.$$

As a result, prove the following identity:

$$\Gamma(s)\,\Gamma(s + 1/2) = \sqrt{\pi}\, 2^{1-2s}\, \Gamma(2s).$$


4. Prove that if we take

$$f(z) = \frac{1}{(1-z)^{\alpha}}, \quad \text{for } |z| < 1$$

(defined in terms of the principal branch of the logarithm), where $\alpha$ is a fixed
complex number, then

$$f(z) = \sum_{n=0}^\infty a_n(\alpha)\, z^n$$

with

$$a_n(\alpha) \sim \frac{1}{\Gamma(\alpha)}\, n^{\alpha-1} \quad \text{as } n \to \infty.$$


5. Use the fact that $\Gamma(s)\Gamma(1-s) = \pi/\sin \pi s$ to prove that

$$|\Gamma(1/2 + it)| = \sqrt{\frac{2\pi}{e^{\pi t} + e^{-\pi t}}}, \quad \text{whenever } t \in \mathbb{R}.$$

6. Show that

$$1 + \frac{1}{3} + \frac{1}{5} + \cdots + \frac{1}{2n-1} - \frac{1}{2} \log n \to \frac{\gamma}{2} + \log 2,$$

where $\gamma$ is Euler's constant.

7. The Beta function is defined for $\mathrm{Re}(\alpha) > 0$ and $\mathrm{Re}(\beta) > 0$ by

$$B(\alpha, \beta) = \int_0^1 (1-t)^{\alpha-1}\, t^{\beta-1}\,dt.$$

(a) Prove that $B(\alpha, \beta) = \dfrac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha+\beta)}$.

(b) Show that $B(\alpha, \beta) = \displaystyle\int_0^\infty \frac{u^{\alpha-1}}{(1+u)^{\alpha+\beta}}\,du$.

[Hint: For part (a), note that

$$\Gamma(\alpha)\Gamma(\beta) = \int_0^\infty \int_0^\infty t^{\alpha-1} s^{\beta-1} e^{-t-s}\,dt\,ds,$$

and make the change of variables $s = ur$, $t = u(1-r)$.]


8. The Bessel functions arise in the study of spherical symmetries and the Fourier
transform. See Chapter 6 in Book I. Prove that the following power series identity
holds for Bessel functions of real order $\nu > -1/2$:

$$J_\nu(x) = \frac{(x/2)^{\nu}}{\Gamma(\nu + 1/2)\sqrt{\pi}} \int_{-1}^{1} e^{ixt} (1 - t^2)^{\nu - (1/2)}\,dt = \left( \frac{x}{2} \right)^{\nu} \sum_{m=0}^\infty \frac{(-1)^m \left( \frac{x^2}{4} \right)^m}{m!\,\Gamma(\nu + m + 1)}$$

whenever $x > 0$. In particular, the Bessel function $J_\nu$ satisfies the ordinary
differential equation

$$\frac{d^2 J_\nu}{dx^2} + \frac{1}{x} \frac{dJ_\nu}{dx} + \left( 1 - \frac{\nu^2}{x^2} \right) J_\nu = 0.$$

[Hint: Expand the exponential $e^{ixt}$ in a power series, and express the remaining
integrals in terms of the gamma function, using Exercise 7.]

9. The hypergeometric series $F(\alpha, \beta, \gamma; z)$ was defined in Exercise 16 of Chapter 1.
Show that

$$F(\alpha, \beta, \gamma; z) = \frac{\Gamma(\gamma)}{\Gamma(\beta)\Gamma(\gamma - \beta)} \int_0^1 t^{\beta-1} (1-t)^{\gamma-\beta-1} (1 - zt)^{-\alpha}\,dt.$$

Here $\alpha > 0$, $\beta > 0$, $\gamma > \beta$, and $|z| < 1$.

Show as a result that the hypergeometric function, initially defined by a power
series convergent in the unit disc, can be continued analytically to the complex
plane slit along the half-line $[1, \infty)$.

Note that

$$\log(1 - z) = -z\,F(1, 1, 2; z),$$

$$e^z = \lim_{\beta \to \infty} F(1, \beta, 1; z/\beta).$$

[Hint: To prove the integral identity, expand $(1 - zt)^{-\alpha}$ as a power series.]


10. An integral of the form

$$F(z) = \int_0^\infty f(t)\, t^{z-1}\,dt$$

is called a Mellin transform, and we shall write $\mathcal{M}(f)(z) = F(z)$. For example,
the gamma function is the Mellin transform of the function $e^{-t}$.

(a) Prove that

$$\mathcal{M}(\cos)(z) = \int_0^\infty \cos(t)\, t^{z-1}\,dt = \Gamma(z) \cos\left( \pi \frac{z}{2} \right) \quad \text{for } 0 < \mathrm{Re}(z) < 1,$$

and

$$\mathcal{M}(\sin)(z) = \int_0^\infty \sin(t)\, t^{z-1}\,dt = \Gamma(z) \sin\left( \pi \frac{z}{2} \right) \quad \text{for } 0 < \mathrm{Re}(z) < 1.$$

(b) Show that the second of the above identities is valid in the larger strip
$-1 < \mathrm{Re}(z) < 1$, and that as a consequence, one has

$$\int_0^\infty \frac{\sin x}{x}\,dx = \frac{\pi}{2} \quad \text{and} \quad \int_0^\infty \frac{\sin x}{x^{3/2}}\,dx = \sqrt{2\pi}.$$

This generalizes the calculation in Exercise 2 of Chapter 2.

[Hint: For the first part, consider the integral of the function $f(w) = e^{-w} w^{z-1}$
around the contour illustrated in Figure 1. Use analytic continuation to prove the
second part.]


Figure 1. The contour in Exercise 10


11. Let $f(z) = e^{az} e^{-e^z}$ where $a > 0$. Observe that in the strip $\{x + iy : |y| < \pi\}$
the function $f(x + iy)$ is exponentially decreasing as $|x|$ tends to infinity. Prove
that

$$\hat{f}(\xi) = \Gamma(a + i\xi), \quad \text{for all } \xi \in \mathbb{R}.$$


12. This exercise gives two simple observations about $1/\Gamma$.

(a) Show that $1/|\Gamma(s)|$ is not $O(e^{c|s|})$ for any $c > 0$. [Hint: If $s = -k - 1/2$,
where $k$ is a positive integer, then $|1/\Gamma(s)| \geq k!/\pi$.]

(b) Show that there is no entire function $F(s)$ with $F(s) = O(e^{c|s|})$ that has
simple zeros at $s = 0, -1, -2, \ldots, -n, \ldots$, and that vanishes nowhere else.


13. Prove that

$$\frac{d^2 \log \Gamma(s)}{ds^2} = \sum_{n=0}^\infty \frac{1}{(s+n)^2}$$

whenever $s$ is a positive number. Show that if the left-hand side is interpreted
as $(\Gamma'/\Gamma)'$, then the above formula also holds for all complex numbers $s$ with
$s \neq 0, -1, -2, \ldots$.


14. This exercise gives an asymptotic formula for $\log n!$. A more refined asymptotic
formula for $\Gamma(s)$ as $s \to \infty$ (Stirling's formula) is given in Appendix A.

(a) Show that

$$\frac{d}{dx} \int_x^{x+1} \log \Gamma(t)\,dt = \log x, \quad \text{for } x > 0,$$

and as a result

$$\int_x^{x+1} \log \Gamma(t)\,dt = x \log x - x + c.$$

(b) Show as a consequence that $\log \Gamma(n) \sim n \log n$ as $n \to \infty$. In fact, prove
that $\log \Gamma(n) \sim n \log n + O(n)$ as $n \to \infty$. [Hint: Use the fact that $\Gamma(x)$ is
monotonically increasing for all large $x$.]


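15. Prove that for $\mathrm{Re}(s) > 1$,

$$\zeta(s) = \frac{1}{\Gamma(s)} \int_0^\infty \frac{x^{s-1}}{e^x - 1}\,dx.$$

[Hint: Write $1/(e^x - 1) = \sum_{n=1}^\infty e^{-nx}$.]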


16. Use the previous exercise to give another proof that $\zeta(s)$ is continuable in the
complex plane with only singularity a simple pole at $s = 1$.

[Hint: Write

$$\zeta(s) = \frac{1}{\Gamma(s)} \int_0^1 \frac{x^{s-1}}{e^x - 1}\,dx + \frac{1}{\Gamma(s)} \int_1^\infty \frac{x^{s-1}}{e^x - 1}\,dx.$$

The second integral defines an entire function, while

$$\int_0^1 \frac{x^{s-1}}{e^x - 1}\,dx = \sum_{m=0}^\infty \frac{B_m}{m!\,(s + m - 1)},$$

where $B_m$ denotes the $m$th Bernoulli number defined by

$$\frac{x}{e^x - 1} = \sum_{m=0}^\infty \frac{B_m}{m!}\, x^m.$$

Then $B_0 = 1$, and since $z/(e^z - 1)$ is holomorphic for $|z| < 2\pi$, we must have
$\limsup_{m \to \infty} |B_m/m!|^{1/m} = 1/2\pi$.]

17. Let $f$ be an indefinitely differentiable function on $\mathbb{R}$ that has compact support,
or more generally, let $f$ belong to the Schwartz space.⁴ Consider

$$I(s) = \frac{1}{\Gamma(s)} \int_0^\infty f(x)\, x^{-1+s}\,dx.$$

(a) Observe that $I(s)$ is holomorphic for $\mathrm{Re}(s) > 0$. Prove that $I$ has an analytic
continuation as an entire function in the complex plane.

(b) Prove that $I(0) = 0$, and more generally

$$I(-n) = (-1)^n f^{(n+1)}(0) \quad \text{for all } n \geq 0.$$

[Hint: To prove the analytic continuation, as well as the formulas in the second
part, integrate by parts to show that $I(s) = \frac{(-1)^k}{\Gamma(s+k)} \int_0^\infty f^{(k)}(x)\, x^{s+k-1}\,dx$.]


4 Problems 

1. This problem provides further estimates for $\zeta$ and $\zeta'$ near $\mathrm{Re}(s) = 1$.

(a) Use Proposition 2.5 and its corollary to prove

$$\zeta(s) = \sum_{1 \leq n < N} n^{-s} - \frac{N^{1-s}}{1-s} + \sum_{n \geq N} \delta_n(s)$$

for every integer $N \geq 2$, whenever $\mathrm{Re}(s) > 0$.

(b) Show that $|\zeta(1 + it)| = O(\log |t|)$, as $|t| \to \infty$, by using the previous result
with $N$ = greatest integer in $|t|$.


(c) The second conclusion of Proposition 2.7 can be similarly refined.

(d) Show that if $t \neq 0$ and $t$ is fixed, then the partial sums of the series
$\sum_{n=1}^\infty 1/n^{1+it}$ are bounded, but the series does not converge.

⁴ The Schwartz space on $\mathbb{R}$ is denoted by $\mathcal{S}$ and consists of all indefinitely differentiable
functions $f$, so that $f$ and all its derivatives decay faster than any polynomials. In other
words, $\sup_{x \in \mathbb{R}} |x|^m |f^{(\ell)}(x)| < \infty$ for all integers $m, \ell \geq 0$. This space appeared in the
study of the Fourier transform in Book I.


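2.* Prove that for $\mathrm{Re}(s) > 0$

$$\zeta(s) = \frac{s}{s-1} - s \int_1^\infty \frac{\{x\}}{x^{s+1}}\,dx$$

where $\{x\}$ is the fractional part of $x$.


3.* If $Q(x) = \{x\} - 1/2$, then we can write the expression in the previous problem
as

$$\zeta(s) = \frac{s}{s-1} - \frac{1}{2} - s \int_1^\infty \frac{Q(x)}{x^{s+1}}\,dx.$$

Let us construct $Q_k(x)$ recursively so that

$$\int_0^1 Q_k(x)\,dx = 0, \quad \frac{dQ_{k+1}}{dx} = Q_k(x), \quad Q_0(x) = Q(x) \quad \text{and} \quad Q_k(x+1) = Q_k(x).$$

Then we can write

$$\zeta(s) = \frac{s}{s-1} - \frac{1}{2} - s \int_1^\infty \left( \frac{d^k}{dx^k} Q_k(x) \right) x^{-s-1}\,dx,$$

and a $k$-fold integration by parts gives the analytic continuation for $\zeta(s)$ when
$\mathrm{Re}(s) > -k$.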


4.* The functions $Q_k$ in the previous problem are related to the Bernoulli
polynomials $B_k(x)$ by the formula

$$Q_k(x) = \frac{B_{k+1}(x)}{(k+1)!} \quad \text{for } 0 \leq x \leq 1.$$

Also, if $k$ is a positive integer, then

$$2\zeta(2k) = (-1)^{k+1} \frac{(2\pi)^{2k}}{(2k)!}\, B_{2k},$$

where $B_k = B_k(0)$ are the Bernoulli numbers. For the definition of $B_k(x)$ and $B_k$
see Chapter 3 in Book I.










The Zeta Function and Prime 
Number Theorem 


Bernhard Riemann, whose extraordinary intuitive powers
we have already mentioned, has especially renovated
our knowledge of the distribution of prime numbers,
also one of the most mysterious questions in
mathematics. He has taught us to deduce results in
that line from considerations borrowed from the
integral calculus: more precisely, from the study of a
certain quantity, a function of a variable s which may
assume not only real, but also imaginary values. He
proved some important properties of that function,
but enunciated two or three as important ones without
giving the proof. At the death of Riemann, a note
was found among his papers, saying “These properties
of ζ(s) (the function in question) are deduced from an
expression of it which, however, I did not succeed in
simplifying enough to publish it.”

We still have not the slightest idea of what the
expression could be. As to the properties he simply
enunciated, some thirty years elapsed before I was able
to prove all of them but one. The question concerning
that last one remains unsolved as yet, though, by
an immense labor pursued throughout this last half
century, some highly interesting discoveries in that
direction have been achieved. It seems more and more
probable, but still not at all certain, that the “Riemann
hypothesis” is true.

J. Hadamard, 1945 


Euler found, through his product formula for the zeta function, a
deep connection between analytical methods and arithmetic properties
of numbers, in particular primes. An easy consequence of Euler's
formula is that the sum of the reciprocals of all primes, $\sum_p 1/p$, diverges,
a result that quantifies the fact that there are infinitely many prime
numbers. The natural problem then becomes that of understanding
how these primes are distributed. With this in mind, we consider the
following function:

$$\pi(x) = \text{number of primes less than or equal to } x.$$

The erratic growth of the function $\pi(x)$ gives little hope of finding a
simple formula for it. Instead, one is led to study the asymptotic behavior
of $\pi(x)$ as $x$ becomes large. About 60 years after Euler's discovery,
Legendre and Gauss observed after numerous calculations that it was
likely that

$$\pi(x) \sim \frac{x}{\log x} \quad \text{as } x \to \infty. \tag{1}$$

(The asymptotic relation $f(x) \sim g(x)$ as $x \to \infty$ means that
$f(x)/g(x) \to 1$ as $x \to \infty$.) Another 60 years later, shortly before
Riemann's work, Tchebychev proved by elementary methods (and in
particular, without the zeta function) the weaker result that

$$\pi(x) \approx \frac{x}{\log x} \quad \text{as } x \to \infty. \tag{2}$$

Here, by definition, the symbol $\approx$ means that there are positive constants
$A < B$ such that

$$A\,\frac{x}{\log x} \leq \pi(x) \leq B\,\frac{x}{\log x}$$

for all sufficiently large $x$.
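
To get a feeling for relations (1) and (2), one can tabulate $\pi(x)$ with a sieve and compare it with $x/\log x$. The sketch below is purely illustrative; the function name `prime_count_table` and the chosen range are assumptions of this example, and the slow drift of the ratio toward 1 says nothing by itself about the limit.

```python
import math

def prime_count_table(limit):
    """Return a list pi with pi[x] = number of primes <= x, for 0 <= x <= limit."""
    # Sieve of Eratosthenes on a bytearray: 1 marks "still possibly prime".
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Strike out the multiples p*p, p*p + p, ... up to limit.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    counts, running = [0] * (limit + 1), 0
    for x in range(limit + 1):
        running += is_prime[x]
        counts[x] = running
    return counts

pi = prime_count_table(10**6)
for x in (10**3, 10**4, 10**5, 10**6):
    # Columns: x, pi(x), and the ratio pi(x) / (x / log x).
    print(x, pi[x], round(pi[x] * math.log(x) / x, 4))
```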

In 1896, about 40 years after Tchebychev's result, Hadamard and de
la Vallée Poussin gave a proof of the validity of the relation (1). Their
result is known as the prime number theorem. The original proofs of
this theorem, as well as the one we give below, use complex analysis.
We should remark that since then other proofs have been found, some
depending on complex analysis, and others more elementary in nature.

At the heart of the proof of the prime number theorem that we give
below lies the fact that $\zeta(s)$ does not vanish on the line $\mathrm{Re}(s) = 1$. In
fact, it can be shown that these two propositions are equivalent.


1 Zeros of the zeta function 


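We have seen in Theorem 1.10, Chapter 8 in Book I, Euler's identity,
which states that for $\mathrm{Re}(s) > 1$ the zeta function can be expressed as an
infinite product

$$\zeta(s) = \prod_p \frac{1}{1 - p^{-s}}.$$

For the sake of completeness we provide a proof of the above identity.
The key observation is that $1/(1 - p^{-s})$ can be written as a convergent
(geometric) power series

$$1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots + \frac{1}{p^{Ms}} + \cdots,$$

and taking formally the product of these series over all primes yields
the desired result. A precise argument goes as follows.

Suppose $M$ and $N$ are positive integers with $M > N$. Observe now
that, by the fundamental theorem of arithmetic,¹ any positive integer
$n \leq N$ can be written uniquely as a product of primes, and that each
prime that occurs in the product must be less than or equal to $N$ and
repeated less than $M$ times. Therefore

$$\sum_{n=1}^{N} \frac{1}{n^s} \leq \prod_{p \leq N} \left( 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots + \frac{1}{p^{Ms}} \right) \leq \prod_{p \leq N} \frac{1}{1 - p^{-s}} \leq \prod_p \frac{1}{1 - p^{-s}}.$$

Letting $N$ tend to infinity in the series now yields

$$\sum_{n=1}^\infty \frac{1}{n^s} \leq \prod_p \frac{1}{1 - p^{-s}}.$$

For the reverse inequality, we argue as follows. Again, by the
fundamental theorem of arithmetic, we find that

$$\prod_{p \leq N} \left( 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots + \frac{1}{p^{Ms}} \right) \leq \sum_{n=1}^\infty \frac{1}{n^s}.$$

Letting $M$ tend to infinity gives

$$\prod_{p \leq N} \frac{1}{1 - p^{-s}} \leq \sum_{n=1}^\infty \frac{1}{n^s}.$$

¹ A proof of this elementary (but essential) fact is given in the first section of Chapter 8

Hence

$$\prod_p \frac{1}{1 - p^{-s}} \leq \sum_{n=1}^\infty \frac{1}{n^s},$$

and the proof of the product formula for $\zeta$ is complete.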
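
The two-sided comparison used in the proof can be observed numerically for a real exponent such as $s = 2$, where $\zeta(2) = \pi^2/6$. The sketch below computes a partial sum and the partial Euler product over primes $p \leq N$; the helper names (`primes_up_to`, `euler_product`, `partial_zeta`) and the cutoff $N = 1000$ are assumptions of this example.

```python
import math

def primes_up_to(limit):
    """Primes <= limit by trial division (adequate for a small illustration)."""
    found = []
    for n in range(2, limit + 1):
        if all(n % p for p in found if p * p <= n):
            found.append(n)
    return found

def euler_product(s, limit):
    """Partial Euler product over primes p <= limit of 1 / (1 - p^{-s}), for real s > 1."""
    prod = 1.0
    for p in primes_up_to(limit):
        prod *= 1.0 / (1.0 - p ** (-s))
    return prod

def partial_zeta(s, N):
    """Partial sum sum_{n=1}^{N} n^{-s}."""
    return math.fsum(n ** (-s) for n in range(1, N + 1))

N = 1000
low, mid, high = partial_zeta(2.0, N), euler_product(2.0, N), math.pi ** 2 / 6
# The proof's chain of inequalities: partial sum <= partial product <= zeta(2).
print(low, mid, high)
```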

From the product formula we see, by Proposition 3.1 in Chapter 5,
that $\zeta(s)$ does not vanish when $\mathrm{Re}(s) > 1$.

To obtain further information about the location of the zeros of $\zeta$, we
use the functional equation that provided the analytic continuation of $\zeta$.
We may write the fundamental relation $\xi(s) = \xi(1 - s)$ in the form

$$\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s) = \pi^{-(1-s)/2}\,\Gamma((1-s)/2)\,\zeta(1-s),$$

and therefore

$$\zeta(s) = \pi^{s - 1/2}\,\frac{\Gamma((1-s)/2)}{\Gamma(s/2)}\,\zeta(1-s).$$

Now observe that for $\mathrm{Re}(s) < 0$ the following are true:

(i) $\zeta(1 - s)$ has no zeros because $\mathrm{Re}(1 - s) > 1$.

(ii) $\Gamma((1-s)/2)$ is zero free.

(iii) $1/\Gamma(s/2)$ has zeros at $s = -2, -4, -6, \ldots$.

Therefore, the only zeros of $\zeta$ in $\mathrm{Re}(s) < 0$ are located at the negative
even integers $-2, -4, -6, \ldots$.
This proves the following theorem. 


Theorem 1.1 The only zeros of $\zeta$ outside the strip $0 \leq \mathrm{Re}(s) \leq 1$ are
at the negative even integers, $-2, -4, -6, \ldots$.

The region that remains to be studied is called the critical strip,
$0 \leq \mathrm{Re}(s) \leq 1$. A key fact in the proof of the prime number theorem is
that $\zeta$ has no zeros on the line $\mathrm{Re}(s) = 1$. As a simple consequence of
this fact and the functional equation, it follows that $\zeta$ has no zeros on
the line $\mathrm{Re}(s) = 0$.

In the seminal paper where Riemann introduced the analytic
continuation of the $\zeta$ function and proved its functional equation, he applied
these insights to the theory of prime numbers, and wrote down “explicit”
formulas for determining the distribution of primes. While he did
not succeed in fully proving and exploiting his assertions, he did initiate
many important new ideas. His analysis led him to believe the truth of
what has since been called the Riemann hypothesis:

The zeros of $\zeta(s)$ in the critical strip lie on the line
$\mathrm{Re}(s) = 1/2$.

He said about this: “It would certainly be desirable to have a rigorous
demonstration of this proposition; nevertheless I have for the moment
set this aside, after several quick but unsuccessful attempts, because it
seemed unneeded for the immediate goal of my study.” Although much
of the theory and numerical results point to the validity of this
hypothesis, a proof or a counter-example remains to be discovered. The Riemann
hypothesis is today one of mathematics' most famous unresolved
problems.

In particular, it is for this reason that the zeros of $\zeta$ located outside the
critical strip are sometimes called the trivial zeros of the zeta function.
See also Exercise 5 for an argument proving that $\zeta$ has no zeros on the
real segment, $0 \leq \sigma \leq 1$, where $s = \sigma + it$.

In the rest of this section we shall restrict ourselves to proving the
following theorem, together with related estimates on $\zeta$, which we shall
use in the proof of the prime number theorem.

Theorem 1.2 The zeta function has no zeros on the line $\mathrm{Re}(s) = 1$.

Of course, since we know that $\zeta$ has a pole at $s = 1$, there are no zeros
in a neighborhood of this point, but what we need is the deeper property
that

$$\zeta(1 + it) \neq 0 \quad \text{for all } t \in \mathbb{R}.$$

The next sequence of lemmas gathers the necessary ingredients for the
proof of Theorem 1.2.

Lemma 1.3 If $\mathrm{Re}(s) > 1$, then

$$\log \zeta(s) = \sum_{p,m} \frac{p^{-ms}}{m} = \sum_{n=1}^\infty c_n n^{-s}$$

for some $c_n \geq 0$.

Proof. Suppose first that $s > 1$. Taking the logarithm of the Euler
product formula, and using the power series expansion for the logarithm

$$\log\left( \frac{1}{1-x} \right) = \sum_{m=1}^\infty \frac{x^m}{m},$$

which holds for $0 \leq x < 1$, we find that

$$\log \zeta(s) = \log \prod_p \frac{1}{1 - p^{-s}} = \sum_p \log\left( \frac{1}{1 - p^{-s}} \right) = \sum_{p,m} \frac{p^{-ms}}{m}.$$

Since the double sum converges absolutely, we need not specify the order
of summation. See the Note at the end of this chapter. The formula
then holds for all $\mathrm{Re}(s) > 1$ by analytic continuation. Note that, by
Theorem 6.2 in Chapter 3, $\log \zeta(s)$ is well defined in the simply connected
half-plane $\mathrm{Re}(s) > 1$, since $\zeta$ has no zeros there. Finally, it is clear that
we have

$$\sum_{p,m} \frac{p^{-ms}}{m} = \sum_{n=1}^\infty c_n n^{-s},$$

where $c_n = 1/m$ if $n = p^m$ and $c_n = 0$ otherwise.

The proof of the theorem we shall give depends on a simple trick that
is based on the following inequality.

Lemma 1.4 If $\theta \in \mathbb{R}$, then $3 + 4\cos\theta + \cos 2\theta \geq 0$.

This follows at once from the simple observation

$$3 + 4\cos\theta + \cos 2\theta = 2(1 + \cos\theta)^2.$$
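
The identity behind Lemma 1.4 follows from $\cos 2\theta = 2\cos^2\theta - 1$, and it can also be spot-checked on a grid of angles. The names below (`trig_combination`, `grid`) are assumptions of this small sketch.

```python
import math

def trig_combination(theta):
    """Left-hand side 3 + 4 cos(theta) + cos(2 theta) of Lemma 1.4."""
    return 3.0 + 4.0 * math.cos(theta) + math.cos(2.0 * theta)

# Sample one full period of the angle.
grid = [2.0 * math.pi * k / 1000 for k in range(1001)]

# Largest deviation from the closed form 2 (1 + cos(theta))^2 on the grid.
worst_gap = max(abs(trig_combination(x) - 2.0 * (1.0 + math.cos(x)) ** 2) for x in grid)

# Smallest value on the grid; the minimum 0 is attained at theta = pi.
smallest = min(trig_combination(x) for x in grid)
print(worst_gap, smallest)
```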


Corollary 1.5 If $\sigma > 1$ and $t$ is real, then

$$\log |\zeta^3(\sigma)\,\zeta^4(\sigma + it)\,\zeta(\sigma + 2it)| \geq 0.$$

Proof. Let $s = \sigma + it$ and note that

$$\mathrm{Re}(n^{-s}) = \mathrm{Re}(e^{-(\sigma + it)\log n}) = e^{-\sigma \log n} \cos(t \log n) = n^{-\sigma} \cos(t \log n).$$

Therefore,

$$\begin{aligned}
\log |\zeta^3(\sigma)\,\zeta^4(\sigma + it)\,\zeta(\sigma + 2it)| &= 3 \log |\zeta(\sigma)| + 4 \log |\zeta(\sigma + it)| + \log |\zeta(\sigma + 2it)| \\
&= 3\,\mathrm{Re}[\log \zeta(\sigma)] + 4\,\mathrm{Re}[\log \zeta(\sigma + it)] + \mathrm{Re}[\log \zeta(\sigma + 2it)] \\
&= \sum_{n=1}^\infty c_n n^{-\sigma} (3 + 4\cos\theta_n + \cos 2\theta_n),
\end{aligned}$$

where $\theta_n = t \log n$. The positivity now follows from Lemma 1.4, and the
fact that $c_n \geq 0$.

We can now finish the proof of our theorem.

Proof of Theorem 1.2. Suppose on the contrary that $\zeta(1 + it_0) = 0$ for
some $t_0 \neq 0$. Since $\zeta$ is holomorphic at $1 + it_0$, it must vanish at least to
order 1 at this point, hence

$$|\zeta(\sigma + it_0)|^4 \leq C(\sigma - 1)^4 \quad \text{as } \sigma \to 1,$$

for some constant $C > 0$. Also, we know that $s = 1$ is a simple pole for
$\zeta(s)$, so that

$$|\zeta(\sigma)|^3 \leq C'(\sigma - 1)^{-3} \quad \text{as } \sigma \to 1,$$

for some constant $C' > 0$. Finally, since $\zeta$ is holomorphic at the points
$\sigma + 2it_0$, the quantity $|\zeta(\sigma + 2it_0)|$ remains bounded as $\sigma \to 1$. Putting
these facts together yields

$$|\zeta^3(\sigma)\,\zeta^4(\sigma + it_0)\,\zeta(\sigma + 2it_0)| \to 0 \quad \text{as } \sigma \to 1,$$

which contradicts Corollary 1.5, since the logarithm of real numbers
between 0 and 1 is negative. This concludes the proof that $\zeta$ is zero free
on the line $\mathrm{Re}(s) = 1$.

1.1 Estimates for $1/\zeta(s)$

The proof of the prime number theorem relies on detailed manipulations
of the zeta function near the line $\mathrm{Re}(s) = 1$; the basic object involved is
the logarithmic derivative $\zeta'(s)/\zeta(s)$. For this reason, besides the
non-vanishing of $\zeta$ on the line, we need to know about the growth of $\zeta'$
and $1/\zeta$. The former was dealt with in Proposition 2.7 of Chapter 6; we
now treat the latter.

The proposition that follows is actually a quantitative version of
Theorem 1.2.

Proposition 1.6 For every $\epsilon > 0$, we have $1/|\zeta(s)| \leq c_\epsilon |t|^{\epsilon}$ when
$s = \sigma + it$, $\sigma \geq 1$, and $|t| \geq 1$.

Proof. From our previous observations, we clearly have that

$$|\zeta^3(\sigma)\zeta^4(\sigma + it)\zeta(\sigma + 2it)| \geq 1, \quad \text{whenever } \sigma \geq 1.$$

Using the estimate for $\zeta$ in Proposition 2.7 of Chapter 6, we find that

$$|\zeta^4(\sigma + it)| \geq c|\zeta^{-3}(\sigma)|\,|t|^{-\epsilon} \geq c'(\sigma - 1)^3 |t|^{-\epsilon},$$

for all $\sigma \geq 1$ and $|t| \geq 1$. Thus

$$(3) \qquad |\zeta(\sigma + it)| \geq c'(\sigma - 1)^{3/4}|t|^{-\epsilon/4}, \quad \text{whenever } \sigma \geq 1 \text{ and } |t| \geq 1.$$

We now consider two separate cases, depending on whether the
inequality $\sigma - 1 \geq A|t|^{-5\epsilon}$ holds, for some appropriate constant $A$ (whose
value we choose later).

If this inequality does hold, then (3) immediately provides

$$|\zeta(\sigma + it)| \geq A'|t|^{-4\epsilon},$$

and it suffices to replace $4\epsilon$ by $\epsilon$ to conclude the proof of the desired
estimate, in this case.

If, however, $\sigma - 1 < A|t|^{-5\epsilon}$, then we first select $\sigma' > \sigma$ with
$\sigma' - 1 = A|t|^{-5\epsilon}$. The triangle inequality then implies

$$|\zeta(\sigma + it)| \geq |\zeta(\sigma' + it)| - |\zeta(\sigma' + it) - \zeta(\sigma + it)|,$$

and an application of the mean value theorem, together with the
estimates for the derivative of $\zeta$ obtained in the previous chapter, give

$$|\zeta(\sigma' + it) - \zeta(\sigma + it)| \leq c''|\sigma' - \sigma|\,|t|^{\epsilon} \leq c''|\sigma' - 1|\,|t|^{\epsilon}.$$

These observations, together with an application of (3) where we set
$\sigma = \sigma'$, show that

$$|\zeta(\sigma + it)| \geq c'(\sigma' - 1)^{3/4}|t|^{-\epsilon/4} - c''(\sigma' - 1)|t|^{\epsilon}.$$

Now choose $A = (c'/(2c''))^4$, and recall that $\sigma' - 1 = A|t|^{-5\epsilon}$. This gives
precisely

$$c'(\sigma' - 1)^{3/4}|t|^{-\epsilon/4} = 2c''(\sigma' - 1)|t|^{\epsilon},$$

and therefore

$$|\zeta(\sigma + it)| \geq A''|t|^{-4\epsilon}.$$

On replacing $4\epsilon$ by $\epsilon$, the desired inequality is established, and the proof
of the proposition is complete.
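The choice $A = (c'/(2c''))^4$ is exactly what balances the two terms in
the last lower bound: with $\sigma' - 1 = A|t|^{-5\epsilon}$ the main term equals
twice the error term, so their difference is $c''A|t|^{-4\epsilon}$. The sketch
below checks this arithmetic for one set of values. The numbers used for
$c'$, $c''$, $\epsilon$ and $t$ are hypothetical placeholders, since the
proof does not determine the constants explicitly.

```python
import math

def balance_check(c1: float, c2: float, eps: float, t: float):
    """Return (main term, twice the error term, resulting lower bound).

    c1 and c2 stand for the constants c' and c'' of the proof; their
    actual values are not known, so any positive numbers may be used.
    """
    A = (c1 / (2.0 * c2)) ** 4
    d = A * abs(t) ** (-5.0 * eps)                   # d = sigma' - 1
    main = c1 * d ** 0.75 * abs(t) ** (-eps / 4.0)   # c'(sigma'-1)^{3/4} |t|^{-eps/4}
    err = c2 * d * abs(t) ** eps                     # c''(sigma'-1) |t|^{eps}
    return main, 2.0 * err, main - err

# Hypothetical sample values, chosen only to exercise the formula.
main, twice_err, lower = balance_check(c1=0.8, c2=3.0, eps=0.1, t=50.0)
expected_lower = 3.0 * (0.8 / (2.0 * 3.0)) ** 4 * 50.0 ** (-0.4)  # c'' A |t|^{-4 eps}
print(main, twice_err, lower, expected_lower)
```

In this reading the constant $A''$ of the proof is $c''A$.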


2 Reduction to the functions $\psi$ and $\psi_1$

In his study of primes, Tchebychev introduced an auxiliary function
whose behavior is to a large extent equivalent to the asymptotic
distribution of primes, but which is easier to manipulate than $\pi(x)$.
Tchebychev's $\psi$-function is defined by

$$\psi(x) = \sum_{p^m \leq x} \log p.$$

The sum is taken over those integers of the form $p^m$ that are less than or
equal to $x$. Here $p$ is a prime number and $m$ is a positive integer. There
are two other formulations of $\psi$ that we shall need. First, if we define

$$\Lambda(n) = \begin{cases} \log p & \text{if } n = p^m \text{ for some prime } p \text{ and some } m \geq 1, \\ 0 & \text{otherwise,} \end{cases}$$

then it is clear that

$$\psi(x) = \sum_{1 \leq n \leq x} \Lambda(n).$$

Also, it is immediate that

$$\psi(x) = \sum_{p \leq x} \left[\frac{\log x}{\log p}\right] \log p,$$

where $[u]$ denotes the greatest integer $\leq u$, and the sum is taken over the
primes less than or equal to $x$. This formula follows from the fact that if
$p^m \leq x$, then $m \leq \log x/\log p$.
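The three formulations of $\psi$ can be compared directly for small $x$.
The following sketch is an illustration only: the helper names are made
up here, and the bracket $[\log x/\log p]$ is computed by counting powers
of $p$ in integer arithmetic, which avoids floating-point rounding at
exact prime powers.

```python
import math

def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def von_mangoldt(n: int, primes: list) -> float:
    """Lambda(n) = log p if n = p^m with m >= 1, and 0 otherwise."""
    for p in primes:
        if p > n:
            break
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0

def psi_prime_powers(x: int, primes: list) -> float:
    """psi(x) as the sum of log p over all prime powers p^m <= x."""
    total = 0.0
    for p in primes:
        q = p
        while q <= x:
            total += math.log(p)
            q *= p
    return total

def psi_von_mangoldt(x: int, primes: list) -> float:
    """psi(x) as the sum of Lambda(n) over 1 <= n <= x."""
    return sum(von_mangoldt(n, primes) for n in range(1, x + 1))

def psi_bracket_form(x: int, primes: list) -> float:
    """psi(x) as the sum over p <= x of [log x / log p] * log p."""
    total = 0.0
    for p in primes:
        m, q = 0, p
        while q <= x:          # m ends as the largest exponent with p^m <= x
            m += 1
            q *= p
        total += m * math.log(p)
    return total

x = 1000
primes = primes_up_to(x)
values = (psi_prime_powers(x, primes),
          psi_von_mangoldt(x, primes),
          psi_bracket_form(x, primes))
print(values)
```

For instance $\psi(10) = 3\log 2 + 2\log 3 + \log 5 + \log 7 = \log 2520$,
the logarithm of the least common multiple of $1, \ldots, 10$.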

The fact that $\psi(x)$ contains enough information about $\pi(x)$ to prove
our theorem is given a precise meaning in the statement of the next
proposition. In particular, this reduces the prime number theorem to a
corresponding asymptotic statement about $\psi$.

Proposition 2.1 If $\psi(x) \sim x$ as $x \to \infty$, then $\pi(x) \sim x/\log x$ as
$x \to \infty$.

Proof. The argument here is elementary. By definition, it suffices to
prove the following two inequalities:

$$(4) \qquad 1 \leq \liminf_{x \to \infty} \pi(x)\frac{\log x}{x} \quad \text{and} \quad \limsup_{x \to \infty} \pi(x)\frac{\log x}{x} \leq 1.$$

To do so, first note that crude estimates give

$$\psi(x) = \sum_{p \leq x} \left[\frac{\log x}{\log p}\right] \log p \leq \sum_{p \leq x} \frac{\log x}{\log p} \log p = \pi(x)\log x,$$

and dividing through by $x$ yields

$$\frac{\psi(x)}{x} \leq \frac{\pi(x)\log x}{x}.$$

The asymptotic condition $\psi(x) \sim x$ implies the first inequality in (4).
The proof of the second inequality is a little trickier. Fix $0 < \alpha < 1$, and
note that

$$\psi(x) \geq \sum_{p \leq x} \log p \geq \sum_{x^{\alpha} < p \leq x} \log p \geq (\pi(x) - \pi(x^{\alpha}))\log x^{\alpha},$$

and therefore

$$\psi(x) + \alpha\pi(x^{\alpha})\log x \geq \alpha\pi(x)\log x.$$

Dividing by $x$, noting that $\pi(x^{\alpha}) \leq x^{\alpha}$, $\alpha < 1$, and $\psi(x) \sim x$, gives

$$1 \geq \alpha \limsup_{x \to \infty} \pi(x)\frac{\log x}{x}.$$

Since $\alpha < 1$ was arbitrary, the proof is complete.
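Both inequalities used in the proof hold for every $x$, not only
asymptotically, so they can be checked on actual prime counts. The
sketch below does this for $x = 10^5$ and $\alpha = 1/2$; these two values
and the helper names are choices made here for illustration.

```python
import math

def primes_up_to(n: int) -> list:
    """Sieve of Eratosthenes: all primes p <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def psi(x: int, primes: list) -> float:
    """Tchebychev's psi(x): sum of log p over prime powers p^m <= x."""
    total = 0.0
    for p in primes:
        q = p
        while q <= x:
            total += math.log(p)
            q *= p
    return total

x = 10 ** 5
alpha = 0.5
primes = primes_up_to(x)

pi_x = len(primes)                                      # pi(x)
pi_x_alpha = sum(1 for p in primes if p <= x ** alpha)  # pi(x^alpha)
psi_x = psi(x, primes)

# First crude estimate: psi(x) <= pi(x) log x.
upper_ok = psi_x <= pi_x * math.log(x)
# Second estimate: psi(x) + alpha pi(x^alpha) log x >= alpha pi(x) log x.
lower_ok = psi_x + alpha * pi_x_alpha * math.log(x) >= alpha * pi_x * math.log(x)

print("psi(x)/x         =", psi_x / x)
print("pi(x) log(x) / x =", pi_x * math.log(x) / x)
print(upper_ok, lower_ok)
```

At this height $\psi(x)/x$ is already very close to $1$, while
$\pi(x)\log x/x$ is still near $1.1$, which illustrates why $\psi$ is the more
convenient function to work with.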

Remark. The converse of the proposition is also true: if $\pi(x) \sim
x/\log x$ then $\psi(x) \sim x$. Since we shall not need this result, we leave the
proof to the interested reader.

In fact, it will be more convenient to work with a close cousin of the
$\psi$ function. Define the function $\psi_1$ by

$$\psi_1(x) = \int_1^x \psi(u)\,du.$$

In the previous proposition we reduced the prime number theorem to
the asymptotics of $\psi(x)$ as $x$ tends to infinity. Next, we show that this
follows from the asymptotics of $\psi_1$.

Proposition 2.2 If $\psi_1(x) \sim x^2/2$ as $x \to \infty$, then $\psi(x) \sim x$ as $x \to \infty$,
and therefore $\pi(x) \sim x/\log x$ as $x \to \infty$.

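Proof. By Proposition 2.1, it suffices to prove that $\psi(x) \sim x$ as
$x \to \infty$. This will follow quite easily from the fact that if $\alpha < 1 < \beta$,
then

$$\frac{1}{(1 - \alpha)x}\int_{\alpha x}^{x} \psi(u)\,du \leq \psi(x) \leq \frac{1}{(\beta - 1)x}\int_{x}^{\beta x} \psi(u)\,du.$$

The proof of this double inequality is immediate and relies simply on the
fact that $\psi$ is increasing. As a consequence, we find, for example, that

$$\psi(x) \leq \frac{1}{(\beta - 1)x}\left[\psi_1(\beta x) - \psi_1(x)\right],$$

and therefore

$$\frac{\psi(x)}{x} \leq \frac{1}{\beta - 1}\left[\frac{\psi_1(\beta x)}{(\beta x)^2}\beta^2 - \frac{\psi_1(x)}{x^2}\right].$$

In turn this implies

$$\limsup_{x \to \infty} \frac{\psi(x)}{x} \leq \frac{1}{\beta - 1}\left[\frac{1}{2}\beta^2 - \frac{1}{2}\right] = \frac{1}{2}(\beta + 1).$$

Since this result is true for all $\beta > 1$, we have proved that
$\limsup_{x \to \infty} \psi(x)/x \leq 1$. A similar argument with $\alpha < 1$ then shows
that $\liminf_{x \to \infty} \psi(x)/x \geq 1$, and the proof of the proposition is
complete.

It is now time to relate $\psi_1$ (and therefore also $\psi$) and $\zeta$. We proved in
Lemma 1.3 that for $\mathrm{Re}(s) > 1$

$$\log \zeta(s) = \sum_{m,p} \frac{p^{-ms}}{m}.$$

Differentiating this expression gives

$$\frac{\zeta'(s)}{\zeta(s)} = -\sum_{m,p} (\log p)\,p^{-ms} = -\sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s}.$$

We record this formula for $\mathrm{Re}(s) > 1$ as

$$(5) \qquad -\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s}.$$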


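The asymptotic behavior $\psi_1(x) \sim x^2/2$ will be a consequence via (5)
of the relationship between $\psi_1$ and $\zeta$, which is expressed by the following
noteworthy integral formula.

Proposition 2.3 For all $c > 1$

$$(6) \qquad \psi_1(x) = \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \frac{x^{s+1}}{s(s+1)}\left(-\frac{\zeta'(s)}{\zeta(s)}\right) ds.$$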


To make the proof of this formula clear, we isolate the necessary
contour integrals in a lemma.

Lemma 2.4 If $c > 0$, then

$$\frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \frac{a^s}{s(s+1)}\,ds = \begin{cases} 0 & \text{if } 0 < a \leq 1, \\ 1 - 1/a & \text{if } 1 \leq a. \end{cases}$$

Here, the integral is over the vertical line $\mathrm{Re}(s) = c$.

Proof. First note that since $|a^s| = a^c$, the integral converges. We
suppose first that $1 \leq a$, and write $a = e^{\beta}$ with $\beta = \log a \geq 0$. Let

$$f(s) = \frac{a^s}{s(s+1)} = \frac{e^{s\beta}}{s(s+1)}.$$

Then $\mathrm{res}_{s=0} f = 1$ and $\mathrm{res}_{s=-1} f = -1/a$. For $T > 0$, consider the path
$\Gamma(T)$ shown on Figure 1.

Figure 1. The contour in the proof of Lemma 2.4 when $a \geq 1$

The path $\Gamma(T)$ consists of the vertical segment $S(T)$ from $c - iT$ to
$c + iT$, and of the half-circle $C(T)$ centered at $c$ of radius $T$, lying to the
left of the vertical segment. We equip $\Gamma(T)$ with the positive
(counterclockwise) orientation, and note that we are dealing with a toy contour.
If we choose $T$ so large that $0$ and $-1$ are contained in the interior of
$\Gamma(T)$, then by the residue formula

$$\frac{1}{2\pi i}\int_{\Gamma(T)} f(s)\,ds = 1 - 1/a.$$

Since

$$\int_{\Gamma(T)} f(s)\,ds = \int_{S(T)} f(s)\,ds + \int_{C(T)} f(s)\,ds,$$

it suffices to prove that the integral over the half-circle goes to 0 as $T$
tends to infinity. Note that if $s = \sigma + it \in C(T)$, then for all large $T$ we
have

$$|s(s+1)| \geq (1/2)T^2,$$

and since $\sigma \leq c$ we also have the estimate $|e^{\beta s}| \leq e^{\beta c}$. Therefore

$$\left|\int_{C(T)} f(s)\,ds\right| \leq \frac{C}{T^2}\,2\pi T \to 0 \quad \text{as } T \to \infty,$$

and the case when $a \geq 1$ is proved.

If $0 < a \leq 1$, consider an analogous contour but with the half-circle
lying to the right of the line $\mathrm{Re}(s) = c$. Noting that there are no poles in
the interior of that contour, we can give an argument similar to the one
given above to show that the integral over the half-circle also goes to 0
as $T$ tends to infinity.
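Lemma 2.4 can also be observed numerically. The sketch below
approximates the integral over the truncated line $|\mathrm{Im}(s)| \leq T$ by
composite Simpson's rule; the cutoff $T$, the number of steps and the
sample values of $a$ are assumptions made here. Since the integrand is
bounded by $a^c/t^2$ for large $|t|$, the truncation error is at most about
$a^c/(\pi T)$, so only a few correct digits should be expected.

```python
import cmath
import math

def line_integral(a: float, c: float = 1.0, T: float = 1000.0,
                  steps: int = 100_000) -> complex:
    """Approximate (1/(2 pi i)) * integral of a^s / (s(s+1)) ds on Re(s) = c.

    With s = c + i t we have ds = i dt, so the quantity equals
    (1/(2 pi)) * integral over -T <= t <= T of f(c + i t) dt.
    The number of steps must be even for Simpson's rule.
    """
    log_a = math.log(a)

    def f(t: float) -> complex:
        s = complex(c, t)
        return cmath.exp(s * log_a) / (s * (s + 1.0))

    h = 2.0 * T / steps
    total = f(-T) + f(T)
    for k in range(1, steps):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes.
        total += (4.0 if k % 2 else 2.0) * f(-T + k * h)
    return total * h / 3.0 / (2.0 * math.pi)

value_large = line_integral(3.0)   # the lemma predicts 1 - 1/3
value_small = line_integral(0.5)   # the lemma predicts 0
print(value_large, value_small)
```

The imaginary parts are negligible, as they must be, because the
integrand at $c - it$ is the complex conjugate of the integrand at $c + it$.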

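We are now ready to prove Proposition 2.3. First, observe that

$$\psi(u) = \sum_{n=1}^{\infty} \Lambda(n) f_n(u),$$

where $f_n(u) = 1$ if $n \leq u$ and $f_n(u) = 0$ otherwise. Therefore,

$$\psi_1(x) = \int_1^x \psi(u)\,du = \sum_{n=1}^{\infty} \int_1^x \Lambda(n) f_n(u)\,du = \sum_{n \leq x} \Lambda(n) \int_n^x du,$$

and hence

$$\psi_1(x) = \sum_{n \leq x} \Lambda(n)(x - n).$$

This fact, together with equation (5) and an application of Lemma 2.4
(with $a = x/n$), gives

$$\frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \frac{x^{s+1}}{s(s+1)}\left(-\frac{\zeta'(s)}{\zeta(s)}\right) ds = x\sum_{n=1}^{\infty} \Lambda(n)\,\frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \frac{(x/n)^s}{s(s+1)}\,ds$$
$$= x\sum_{n \leq x} \Lambda(n)\left(1 - \frac{n}{x}\right) = \psi_1(x),$$

as was to be shown.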
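The identity $\psi_1(x) = \sum_{n \leq x} \Lambda(n)(x - n)$ is easy to test for integer
$x$, because $\psi$ is a step function that is constant on each interval
$[k, k+1)$, so its integral over $[1, x]$ is a finite sum. The sketch below
compares the two expressions and also prints the ratio $2\psi_1(x)/x^2$.
The value $x = 10^4$ and the helper names are choices made here.

```python
import math

def von_mangoldt_table(n: int) -> list:
    """Return a list lam with lam[k] = Lambda(k) for 0 <= k <= n."""
    lam = [0.0] * (n + 1)
    is_composite = bytearray(n + 1)
    for p in range(2, n + 1):
        if is_composite[p]:
            continue
        for multiple in range(2 * p, n + 1, p):
            is_composite[multiple] = 1
        q = p
        while q <= n:               # mark every power of the prime p
            lam[q] = math.log(p)
            q *= p
    return lam

def psi1_by_integration(x: int, lam: list) -> float:
    """Integrate the step function psi over [1, x] for an integer x."""
    total, psi_k = 0.0, 0.0
    for k in range(1, x):
        psi_k += lam[k]             # psi(u) for k <= u < k + 1
        total += psi_k              # contribution of the interval [k, k + 1]
    return total

def psi1_by_formula(x: int, lam: list) -> float:
    """Evaluate the sum of Lambda(n) (x - n) over n <= x."""
    return sum(lam[n] * (x - n) for n in range(1, x + 1))

x = 10 ** 4
lam = von_mangoldt_table(x)
integral_value = psi1_by_integration(x, lam)
formula_value = psi1_by_formula(x, lam)
ratio = 2.0 * formula_value / x ** 2
print(integral_value, formula_value, ratio)
```

The ratio is already close to $1$ at this modest height, consistent
with the asymptotic $\psi_1(x) \sim x^2/2$ that the rest of the chapter
establishes.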


2.1 Proof of the asymptotics for $\psi_1$

In this section, we will show that

$$\psi_1(x) \sim x^2/2 \quad \text{as } x \to \infty,$$

and as a consequence, we will have proved the prime number theorem.
The key ingredients in the argument are:

• the formula in Proposition 2.3 connecting $\psi_1$ to $\zeta$, namely

$$\psi_1(x) = \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} \frac{x^{s+1}}{s(s+1)}\left(-\frac{\zeta'(s)}{\zeta(s)}\right) ds$$

for $c > 1$.

• the non-vanishing of the zeta function on $\mathrm{Re}(s) = 1$,

$$\zeta(1 + it) \neq 0 \quad \text{for all } t \in \mathbb{R},$$

and the estimates for $\zeta$ near that line given in Proposition 2.7 of
Chapter 6 together with Proposition 1.6 of this chapter.

Let us now discuss our strategy in more detail. In the integral above
for $\psi_1(x)$ we want to change the line of integration $\mathrm{Re}(s) = c$ with $c > 1$,
to $\mathrm{Re}(s) = 1$. If we could achieve that, the size of the factor $x^{s+1}$ in the
integrand would then be of order $x^2$ (which is close to what we want)
instead of $x^{c+1}$, $c > 1$, which is much too large. However, there would
still be two issues that must be dealt with. The first is the pole of $\zeta(s)$
at $s = 1$; it turns out that when it is taken into account, its contribution
is exactly the main term $x^2/2$ of the asymptotic of $\psi_1(x)$. Second, what
remains must be shown to be essentially smaller than this term, and so
we must further refine the crude estimate of order $x^2$ when integrating
on the line $\mathrm{Re}(s) = 1$. We carry out our plan as follows.

Fix $c > 1$, say $c = 2$, and assume $x$ is also fixed for the moment with
$x \geq 2$. Let $F(s)$ denote the integrand

$$F(s) = \frac{x^{s+1}}{s(s+1)}\left(-\frac{\zeta'(s)}{\zeta(s)}\right).$$

First we deform the vertical line from $c - i\infty$ to $c + i\infty$ to the path $\gamma(T)$
shown in Figure 2. (The segments of $\gamma(T)$ on the line $\mathrm{Re}(s) = 1$ consist
of $T \leq t < \infty$, and $-\infty < t \leq -T$.) Here $T \geq 3$, and $T$ will be chosen
appropriately large later.


Figure 2. Three stages: the line $\mathrm{Re}(s) = c$, the contours $\gamma(T)$ and
$\gamma(T, \delta)$


The usual and familiar arguments using Cauchy’s theorem allow us to
see that

$$(7) \qquad \frac{1}{2\pi i}\int_{c - i\infty}^{c + i\infty} F(s)\,ds = \frac{1}{2\pi i}\int_{\gamma(T)} F(s)\,ds.$$

Indeed, we know on the basis of Proposition 2.7 in Chapter 6 and
Proposition 1.6 that $|\zeta'(s)/\zeta(s)| \leq A|t|^{\eta}$ for any fixed $\eta > 0$, whenever
$s = \sigma + it$, $\sigma \geq 1$, and $|t| \geq 1$. Thus $|F(s)| \leq A'|t|^{-2+\eta}$ in the two
(infinite) rectangles bounded by the line $(c - i\infty, c + i\infty)$ and $\gamma(T)$. Since $F$
is regular in that region, and its decrease at infinity is rapid enough, the
assertion (7) is established.

Next, we pass from the contour $\gamma(T)$ to the contour $\gamma(T, \delta)$. (Again,
see Figure 2.) For fixed $T$, we choose $\delta > 0$ small enough so that $\zeta$ has
no zeros in the box

$$\{s = \sigma + it, \ 1 - \delta \leq \sigma \leq 1, \ |t| \leq T\}.$$

Such a choice can be made since $\zeta$ does not vanish on the line $\sigma = 1$.

Now $F(s)$ has a simple pole at $s = 1$. In fact, by Corollary 2.6 in
Chapter 6, we know that $\zeta(s) = 1/(s - 1) + H(s)$, where $H(s)$ is regular near
$s = 1$. Hence $-\zeta'(s)/\zeta(s) = 1/(s - 1) + h(s)$, where $h(s)$ is holomorphic
near $s = 1$, and so the residue of $F(s)$ at $s = 1$ equals $x^2/2$. As a result

$$\frac{1}{2\pi i}\int_{\gamma(T)} F(s)\,ds = \frac{x^2}{2} + \frac{1}{2\pi i}\int_{\gamma(T, \delta)} F(s)\,ds.$$

We now decompose the contour $\gamma(T, \delta)$ as $\gamma_1 + \gamma_2 + \gamma_3 + \gamma_4 + \gamma_5$ and
estimate each of the integrals $\int_{\gamma_j} F(s)\,ds$, $j = 1, 2, 3, 4, 5$, with the $\gamma_j$ as
in Figure 2.

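First we contend that there exists $T$ so large that

$$\left|\int_{\gamma_1} F(s)\,ds\right| \leq \frac{\epsilon}{2}x^2 \quad \text{and} \quad \left|\int_{\gamma_5} F(s)\,ds\right| \leq \frac{\epsilon}{2}x^2.$$

To see this, we first note that for $s \in \gamma_1$ one has

$$|x^{1+s}| = x^{1+\sigma} = x^2.$$

Then, by Proposition 1.6 we have, for example, that $|\zeta'(s)/\zeta(s)| \leq A|t|^{1/2}$,
so

$$\left|\int_{\gamma_1} F(s)\,ds\right| \leq Cx^2 \int_T^{\infty} \frac{|t|^{1/2}}{t^2}\,dt.$$

Since the integral converges, we can make the right-hand side $\leq \epsilon x^2/2$
upon taking $T$ sufficiently large. The argument for the integral over $\gamma_5$
is the same.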
Having now fixed $T$, we choose $\delta$ appropriately small. On $\gamma_3$, note
that

$$|x^{1+s}| = x^{1+1-\delta} = x^{2-\delta},$$

from which we conclude that there exists a constant $C_T$ (dependent on
$T$) such that

$$\left|\int_{\gamma_3} F(s)\,ds\right| \leq C_T x^{2-\delta}.$$

Finally, on the small horizontal segment $\gamma_2$ (and similarly on $\gamma_4$), we can
estimate the integral as follows:

$$\left|\int_{\gamma_2} F(s)\,ds\right| \leq C'_T \int_{1-\delta}^{1} x^{1+\sigma}\,d\sigma \leq C'_T \frac{x^2}{\log x}.$$

We conclude that there exist constants $C_T$ and $C'_T$ (possibly different
from the ones above) such that

$$\left|\psi_1(x) - \frac{x^2}{2}\right| \leq \epsilon x^2 + C_T x^{2-\delta} + C'_T \frac{x^2}{\log x}.$$

Dividing through by $x^2/2$, we see that

$$\left|\frac{2\psi_1(x)}{x^2} - 1\right| \leq 2\epsilon + 2C_T x^{-\delta} + 2C'_T \frac{1}{\log x},$$

and therefore, for all large $x$ we have

$$\left|\frac{2\psi_1(x)}{x^2} - 1\right| \leq 4\epsilon.$$

This concludes the proof that

$$\psi_1(x) \sim x^2/2 \quad \text{as } x \to \infty,$$

and thus, we have also completed the proof of the prime number theorem.


Note on interchanging double sums 

We prove the following facts about the interchange of infinite sums: if
$\{a_{k\ell}\}_{1 \leq k, \ell < \infty}$ is a sequence of complex numbers indexed by
$\mathbb{N} \times \mathbb{N}$, such that

$$(8) \qquad \sum_{k=1}^{\infty}\left(\sum_{\ell=1}^{\infty} |a_{k\ell}|\right) < \infty,$$

then:

(i) The double sum $A = \sum_{k=1}^{\infty}\left(\sum_{\ell=1}^{\infty} a_{k\ell}\right)$ summed in this order converges, and
we may in fact also interchange the order of summation, so that

$$A = \sum_{k=1}^{\infty}\sum_{\ell=1}^{\infty} a_{k\ell} = \sum_{\ell=1}^{\infty}\sum_{k=1}^{\infty} a_{k\ell}.$$

(ii) Given $\epsilon > 0$, there is a positive integer $N$ so that for all $K, L > N$ we have
$\left|A - \sum_{k=1}^{K}\sum_{\ell=1}^{L} a_{k\ell}\right| < \epsilon$.

(iii) If $m \mapsto (k(m), \ell(m))$ is a bijection from $\mathbb{N}$ to $\mathbb{N} \times \mathbb{N}$, and if we write
$c_m = a_{k(m)\ell(m)}$, then $A = \sum_{k=1}^{\infty} c_k$.

Statement (iii) says that any rearrangement of the sequence $\{a_{k\ell}\}$ can be summed
without changing the limit. This is analogous to the case of absolutely convergent
series, which can be summed in any desired order.

The condition (8) says that each sum $\sum_{\ell} a_{k\ell}$ converges absolutely, and moreover
this convergence is “uniform” in $k$. An analogous situation arises for sequences of
functions, where an important question is whether or not the interchange of limits

$$\lim_{x \to x_0}\lim_{n \to \infty} f_n(x) = \lim_{n \to \infty}\lim_{x \to x_0} f_n(x)$$

holds. It is a well-known fact that if the $f_n$’s are continuous, and their convergence
is uniform, then the above identity is true since the limit function is itself
continuous. To take advantage of this fact, define $b_k = \sum_{\ell=1}^{\infty} |a_{k\ell}|$ and let
$S = \{x_0, x_1, \ldots\}$ be a countable set of points with $\lim_{n \to \infty} x_n = x_0$. Also,
define functions on $S$ as follows:

$$f_k(x_0) = \sum_{\ell=1}^{\infty} a_{k\ell} \quad \text{for } k = 1, 2, \ldots$$
$$f_k(x_n) = \sum_{\ell=1}^{n} a_{k\ell} \quad \text{for } k = 1, 2, \ldots \text{ and } n = 1, 2, \ldots$$
$$g(x) = \sum_{k=1}^{\infty} f_k(x) \quad \text{for } x \in S.$$

By assumption (8), each $f_k$ is continuous at $x_0$. Moreover $|f_k(x)| \leq b_k$ and
$\sum b_k < \infty$, so the series defining the function $g$ is uniformly convergent on $S$,
and therefore $g$ is also continuous at $x_0$. As a consequence we find (i), since

$$\sum_{k=1}^{\infty}\sum_{\ell=1}^{\infty} a_{k\ell} = g(x_0) = \lim_{n \to \infty} g(x_n) = \lim_{n \to \infty}\sum_{k=1}^{\infty}\sum_{\ell=1}^{n} a_{k\ell}$$
$$= \lim_{n \to \infty}\sum_{\ell=1}^{n}\sum_{k=1}^{\infty} a_{k\ell} = \sum_{\ell=1}^{\infty}\sum_{k=1}^{\infty} a_{k\ell}.$$

For the second statement, first observe that

$$\left|A - \sum_{k=1}^{K}\sum_{\ell=1}^{L} a_{k\ell}\right| \leq \sum_{k \leq K}\sum_{\ell > L} |a_{k\ell}| + \sum_{k > K}\sum_{\ell=1}^{\infty} |a_{k\ell}|.$$

To estimate the second term, we use the fact that $\sum b_k$ converges, which implies
$\sum_{k > K}\sum_{\ell=1}^{\infty} |a_{k\ell}| < \epsilon/2$ whenever $K > K_0$, for some $K_0$. For the first term above,
note that $\sum_{k \leq K}\sum_{\ell > L} |a_{k\ell}| \leq \sum_{k=1}^{\infty}\sum_{\ell > L} |a_{k\ell}|$. But the argument above
guarantees that we can interchange these last two sums; also
$\sum_{\ell=1}^{\infty}\sum_{k=1}^{\infty} |a_{k\ell}| < \infty$, so that for all $L > L_0$ we have
$\sum_{\ell > L}\sum_{k=1}^{\infty} |a_{k\ell}| < \epsilon/2$. Taking $N > \max(L_0, K_0)$
completes the proof of (ii).

The proof of (iii) is a direct consequence of (ii). Indeed, given any rectangle

$$R(K, L) = \{(k, \ell) \in \mathbb{N} \times \mathbb{N} : 1 \leq k \leq K \text{ and } 1 \leq \ell \leq L\},$$

there exists $M$ such that the image of $[1, M]$ under the map $m \mapsto (k(m), \ell(m))$
contains $R(K, L)$.

When $U$ denotes any open set in $\mathbb{R}^2$ that contains the origin, we define for $R > 0$
its dilate $U(R) = \{y : y = Rx \text{ for some } x \in U\}$, and we can apply (ii) to see
that

$$A = \lim_{R \to \infty} \sum_{(k, \ell) \in U(R)} a_{k\ell}.$$

In other words, under condition (8) the double sum $\sum_{k\ell} a_{k\ell}$ can be evaluated by
summing over discs, squares, rectangles, ellipses, etc.

Finally, we leave the reader with the instructive task of finding a sequence of
complex numbers $\{a_{k\ell}\}$ such that

$$\sum_{k}\sum_{\ell} a_{k\ell} \neq \sum_{\ell}\sum_{k} a_{k\ell}.$$

[Hint: Consider $\{a_{k\ell}\}$ as the entries of an infinite matrix with $0$ above the diagonal,
$-1$ on the diagonal, and $a_{k\ell} = 2^{\ell - k}$ if $k > \ell$.]


3 Exercises 

1. Suppose that $\{a_n\}_{n=1}^{\infty}$ is a sequence of real numbers such that the partial sums

$$A_n = a_1 + \cdots + a_n$$

are bounded. Prove that the Dirichlet series

$$\sum_{n=1}^{\infty} \frac{a_n}{n^s}$$

converges for $\mathrm{Re}(s) > 0$ and defines a holomorphic function in this half-plane.
[Hint: Use summation by parts to compare the original (non-absolutely convergent)
series to the (absolutely convergent) series $\sum A_n(n^{-s} - (n+1)^{-s})$. An estimate
for the term in parentheses is provided by the mean value theorem. To prove
that the series is analytic, show that the partial sums converge uniformly on every
compact subset of the half-plane $\mathrm{Re}(s) > 0$.]
2. The following links the multiplication of Dirichlet series with the divisibility
properties of their coefficients.

(a) Show that if $\{a_m\}$ and $\{b_k\}$ are two bounded sequences of complex numbers,
then

$$\left(\sum_{m=1}^{\infty} \frac{a_m}{m^s}\right)\left(\sum_{k=1}^{\infty} \frac{b_k}{k^s}\right) = \sum_{n=1}^{\infty} \frac{c_n}{n^s} \quad \text{where } c_n = \sum_{mk = n} a_m b_k.$$

The above series converge absolutely when $\mathrm{Re}(s) > 1$.

(b) Prove as a consequence that one has

$$(\zeta(s))^2 = \sum_{n=1}^{\infty} \frac{d(n)}{n^s} \quad \text{and} \quad \zeta(s)\zeta(s - a) = \sum_{n=1}^{\infty} \frac{\sigma_a(n)}{n^s}$$

for $\mathrm{Re}(s) > 1$ and $\mathrm{Re}(s - a) > 1$, respectively. Here $d(n)$ equals the number
of divisors of $n$, and $\sigma_a(n)$ is the sum of the $a$-th powers of divisors of $n$. In
particular, one has $\sigma_0(n) = d(n)$.

3. In line with the previous exercise, we consider the Dirichlet series for $1/\zeta$.

(a) Prove that for $\mathrm{Re}(s) > 1$

$$\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s},$$

where $\mu(n)$ is the Möbius function defined by

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ (-1)^k & \text{if } n = p_1 \cdots p_k, \text{ and the } p_j \text{ are distinct primes,} \\ 0 & \text{otherwise.} \end{cases}$$

Note that $\mu(nm) = \mu(n)\mu(m)$ whenever $n$ and $m$ are relatively prime. [Hint:
Use the Euler product formula for $\zeta(s)$.]

(b) Show that

$$\sum_{k | n} \mu(k) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{otherwise.} \end{cases}$$


4. Suppose $\{a_n\}_{n=1}^{\infty}$ is a sequence of complex numbers such that $a_n = a_m$ if $n \equiv m$
mod $q$ for some positive integer $q$. Define the Dirichlet $L$-series associated to
$\{a_n\}$ by

$$L(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^s} \quad \text{for } \mathrm{Re}(s) > 1.$$

Also, with $a_0 = a_q$, let

$$Q(x) = \sum_{m=0}^{q-1} a_{q-m} e^{mx}.$$

Show, as in Exercises 15 and 16 of the previous chapter, that

$$L(s) = \frac{1}{\Gamma(s)}\int_0^{\infty} \frac{Q(x)x^{s-1}}{e^{qx} - 1}\,dx, \quad \text{for } \mathrm{Re}(s) > 1.$$

Prove as a result that $L(s)$ is continuable into the complex plane, with the only
possible singularity a pole at $s = 1$. In fact, $L(s)$ is regular at $s = 1$ if and only if
$\sum_{m=0}^{q-1} a_m = 0$. Note the connection with the Dirichlet $L(s, \chi)$ series, taken up in
Book I, Chapter 8, and that as a consequence, $L(s, \chi)$ is regular at $s = 1$ if and
only if $\chi$ is a non-trivial character.


5. Consider the following function

$$\tilde{\zeta}(s) = 1 - \frac{1}{2^s} + \frac{1}{3^s} - \cdots = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^s}.$$

(a) Prove that the series defining $\tilde{\zeta}(s)$ converges for $\mathrm{Re}(s) > 0$ and defines a
holomorphic function in that half-plane.

(b) Show that for $s > 1$ one has $\tilde{\zeta}(s) = (1 - 2^{1-s})\zeta(s)$.

(c) Conclude, since $\tilde{\zeta}$ is given as an alternating series, that $\zeta$ has no zeros on
the segment $0 < \sigma < 1$. Extend this last assertion to $\sigma = 0$ by using the
functional equation.


6. Show that for every $c > 0$

$$\lim_{N \to \infty} \frac{1}{2\pi i}\int_{c - iN}^{c + iN} a^s \frac{ds}{s} = \begin{cases} 1 & \text{if } a > 1, \\ 1/2 & \text{if } a = 1, \\ 0 & \text{if } 0 \leq a < 1. \end{cases}$$

The integral is taken over the vertical segment from $c - iN$ to $c + iN$.

7. Show that the function

$$\xi(s) = \pi^{-s/2}\Gamma(s/2)\zeta(s)$$

is real when $s$ is real, or when $\mathrm{Re}(s) = 1/2$.

8. The function $\zeta$ has infinitely many zeros in the critical strip. This can be seen
as follows.

(a) Let

$$F(s) = \xi(1/2 + s), \quad \text{where } \xi(s) = \pi^{-s/2}\Gamma(s/2)\zeta(s).$$

Show that $F(s)$ is an even function of $s$, and as a result, there exists $G$ so
that $G(s^2) = F(s)$.

(b) Show that the function $(s - 1)\zeta(s)$ is an entire function of growth order 1,
that is

$$|(s - 1)\zeta(s)| \leq A_{\epsilon} e^{a_{\epsilon}|s|^{1+\epsilon}}.$$

As a consequence $G(s)$ is of growth order 1/2.

(c) Deduce from the above that $\zeta$ has infinitely many zeros in the critical strip.

[Hint: To prove (a) and (b) use the functional equation for $\zeta(s)$. For (c), use a
result of Hadamard, which states that an entire function with fractional order has
infinitely many zeros (Exercise 14 in Chapter 5).]

9. Refine the estimates in Proposition 2.7 in Chapter 6 and Proposition 1.6 to
show that

(a) $|\zeta(1 + it)| \leq A\log|t|$,

(b) $|\zeta'(1 + it)| \leq A(\log|t|)^2$,

(c) $1/|\zeta(1 + it)| \leq A(\log|t|)^a$,

when $|t| \geq 2$ (with $a = 7$).


10. In the theory of primes, a better approximation to $\pi(x)$ (instead of $x/\log x$)
turns out to be $\mathrm{Li}(x)$ defined by

$$\mathrm{Li}(x) = \int_2^x \frac{dt}{\log t}.$$

(a) Prove that

$$\mathrm{Li}(x) = \frac{x}{\log x} + O\left(\frac{x}{(\log x)^2}\right) \quad \text{as } x \to \infty,$$

and that as a consequence

$$\pi(x) \sim \mathrm{Li}(x) \quad \text{as } x \to \infty.$$

[Hint: Integrate by parts in the definition of $\mathrm{Li}(x)$ and observe that it suffices
to prove

$$\int_2^x \frac{dt}{(\log t)^2} = O\left(\frac{x}{(\log x)^2}\right).$$

To see this, split the integral from $2$ to $\sqrt{x}$ and from $\sqrt{x}$ to $x$.]

(b) Refine the previous analysis by showing that for every integer $N > 0$ one
has the following asymptotic expansion

$$\mathrm{Li}(x) = \frac{x}{\log x} + \frac{x}{(\log x)^2} + 2\frac{x}{(\log x)^3} + \cdots + (N - 1)!\frac{x}{(\log x)^N} + O\left(\frac{x}{(\log x)^{N+1}}\right)$$

as $x \to \infty$.

11. Let

$$\varphi(x) = \sum_{p \leq x} \log p$$

where the sum is taken over all primes $\leq x$. Prove that the following are equivalent
as $x \to \infty$:

(i) $\varphi(x) \sim x$,

(ii) $\pi(x) \sim x/\log x$,

(iii) $\psi(x) \sim x$,

(iv) $\psi_1(x) \sim x^2/2$.

12. If $p_n$ denotes the $n$-th prime, the prime number theorem implies that
$p_n \sim n\log n$ as $n \to \infty$.

(a) Show that $\pi(x) \sim x/\log x$ implies that

$$\log \pi(x) + \log\log x \sim \log x.$$

(b) As a consequence, prove that $\log \pi(x) \sim \log x$, and take $x = p_n$ to conclude
the proof.

4 Problems 

1. Let $F(s) = \sum_{n=1}^{\infty} a_n/n^s$, where $|a_n| \leq M$ for all $n$.

(a) Then

$$\lim_{T \to \infty} \frac{1}{2T}\int_{-T}^{T} |F(\sigma + it)|^2\,dt = \sum_{n=1}^{\infty} \frac{|a_n|^2}{n^{2\sigma}} \quad \text{if } \sigma > 1.$$

How is this reminiscent of the Parseval-Plancherel theorem? See e.g.
Chapter 3 in Book I.

(b) Show as a consequence the uniqueness of Dirichlet series: If $F(s) = \sum_{n=1}^{\infty} a_n n^{-s}$,
where the coefficients are assumed to satisfy $|a_n| \leq cn^k$ for some $k$, and
$F(s) \equiv 0$, then $a_n = 0$ for all $n$.

Hint: For part (a) use the fact that

$$\frac{1}{2T}\int_{-T}^{T} (nm)^{-\sigma} n^{-it} m^{it}\,dt \to \begin{cases} n^{-2\sigma} & \text{if } n = m, \\ 0 & \text{if } n \neq m. \end{cases}$$


2.* One of the “explicit formulas” in the theory of primes is as follows: if $\psi_1$ is the
integrated Tchebychev function considered in Section 2, then

$$\psi_1(x) = \frac{x^2}{2} - \sum_{\rho} \frac{x^{\rho + 1}}{\rho(\rho + 1)} - E(x)$$

where the sum is taken over all zeros $\rho$ of the zeta function in the critical strip.
The error term is given by $E(x) = c_1 x + c_0 + \sum_{k=1}^{\infty} x^{1-2k}/(2k(2k - 1))$, where
$c_1 = \zeta'(0)/\zeta(0)$ and $c_0 = \zeta'(-1)/\zeta(-1)$. Note that $\sum_{\rho} 1/|\rho|^{1+\epsilon} < \infty$ for every
$\epsilon > 0$, because $(1 - s)\zeta(s)$ has order of growth 1. (See Exercise 8.) Also,
obviously $E(x) = O(x)$ as $x \to \infty$.


3.* Using the previous problem one can show that

$$\pi(x) - \mathrm{Li}(x) = O(x^{\alpha + \epsilon}) \quad \text{as } x \to \infty$$

for every $\epsilon > 0$, where $\alpha$ is fixed and $1/2 \leq \alpha < 1$ if and only if $\zeta$ has no zeros in
the strip $\alpha < \mathrm{Re}(s) < 1$. The case $\alpha = 1/2$ corresponds to the Riemann hypothesis.


4.* One can combine ideas from the prime number theorem with the proof of
Dirichlet’s theorem about primes in arithmetic progression (given in Book I) to
prove the following. Let $q$ and $\ell$ be relatively prime integers. We consider the
primes belonging to the arithmetic progression $\{qk + \ell\}_{k=1}^{\infty}$, and let $\pi_{q,\ell}(x)$ denote
the number of such primes $\leq x$. Then one has

$$\pi_{q,\ell}(x) \sim \frac{x}{\varphi(q)\log x} \quad \text{as } x \to \infty,$$

where $\varphi(q)$ denotes the number of positive integers less than $q$ and relatively prime
to $q$.




Conformal Mappings 


The results I found for polygons can be extended under
very general assumptions. I have undertaken this
research because it is a step towards a deeper
understanding of the mapping problem, for which not
much has happened since Riemann’s inaugural
dissertation; this, even though the theory of mappings, with
its close connection with the fundamental theorems of
Riemann’s function theory, deserves in the highest
degree to be developed further.

E. B. Christoffel, 1870


The problems and ideas we present in this chapter are more geometric
in nature than the ones we have seen so far. In fact, here we will
be primarily interested in mapping properties of holomorphic functions.
In particular, most of our results will be “global,” as opposed to the
more “local” analytical results proved in the first three chapters. The
motivation behind much of our presentation lies in the following simple
question:

Given two open sets $U$ and $V$ in $\mathbb{C}$, does there exist a
holomorphic bijection between them?

By a holomorphic bijection we simply mean a function that is both
holomorphic and bijective. (It will turn out that the inverse map is then
automatically holomorphic.) A solution to this problem would permit
a transfer of questions about analytic functions from one open set with
little geometric structure to another with possibly more useful properties.
The prime example consists in taking $V = \mathbb{D}$ the unit disc, where many
ideas have been developed to study analytic functions.$^1$ In fact, since
the disc seems to be the most fruitful choice for $V$ we are led to a variant
of the above question:

Given an open subset $\Omega$ of $\mathbb{C}$, what conditions on $\Omega$ guarantee
that there exists a holomorphic bijection from $\Omega$ to $\mathbb{D}$?

$^1$ For the corresponding problem when $V = \mathbb{C}$, the solution is trivial: only $U = \mathbb{C}$ is
possible. See Exercise 14 in Chapter 3.
In some instances when a bijection exists it can be given by explicit
formulas, and we turn to this aspect of the theory first. For example, the
upper half-plane can be mapped by a holomorphic bijection to the disc,
and this is given by a fractional linear transformation. From there, one
can construct many other examples, by composing simple maps already
encountered earlier, such as rational functions, trigonometric functions,
logarithms, etc. As an application, we discuss the consequence of these
constructions to the solution of the Dirichlet problem for the Laplacian
in some particular domains.

Next, we pass from the specific examples to prove the first general
result of the chapter, namely the Schwarz lemma, with an immediate
application to the determination of all holomorphic bijections
(“automorphisms”) of the disc to itself. These are again given by fractional
linear transformations.

Then comes the heart of the matter: the Riemann mapping theorem,
which states that $\Omega$ can be mapped to the unit disc whenever it is simply
connected and not all of $\mathbb{C}$. This is a remarkable theorem, since little
is assumed about $\Omega$, not even regularity of its boundary $\partial\Omega$. (After
all, the boundary of the disc is smooth.) In particular, the interiors of
triangles, squares, and in fact any polygon can be mapped via a bijective
holomorphic function to the disc. A precise description of the mapping
in the case of polygons, called the Schwarz-Christoffel formula, will be
taken up in the last section of the chapter. It is interesting to note that
the mapping functions for rectangles are given by “elliptic integrals,” and
these lead to doubly-periodic functions. The latter are the subject of the
next chapter.


1 Conformal equivalence and examples 

We fix some terminology that we shall use in the rest of this chapter.
A bijective holomorphic function $f : U \to V$ is called a conformal map
or biholomorphism. Given such a mapping $f$, we say that $U$ and $V$
are conformally equivalent or simply biholomorphic. An important
fact is that the inverse of $f$ is then automatically holomorphic.

Proposition 1.1 If $f : U \to V$ is holomorphic and injective, then
$f'(z) \neq 0$ for all $z \in U$. In particular, the inverse of $f$ defined on its
range is holomorphic, and thus the inverse of a conformal map is also
holomorphic.
Proof. We argue by contradiction, and suppose that $f'(z_0) = 0$ for
some $z_0 \in U$. Then

$$f(z) - f(z_0) = a(z - z_0)^k + G(z) \quad \text{for all } z \text{ near } z_0,$$

with $a \neq 0$, $k \geq 2$ and $G$ vanishing to order $k + 1$ at $z_0$. For sufficiently
small $w$, we write

$$f(z) - f(z_0) - w = F(z) + G(z), \quad \text{where } F(z) = a(z - z_0)^k - w.$$

Since $|G(z)| < |F(z)|$ on a small circle centered at $z_0$, and $F$ has at
least two zeros inside that circle, Rouché’s theorem implies that
$f(z) - f(z_0) - w$ has at least two zeros there. Since $f'(z) \neq 0$ for all $z \neq z_0$ but
sufficiently close to $z_0$ it follows that the roots of $f(z) - f(z_0) - w$ are
distinct, hence $f$ is not injective, a contradiction.

Now let $g = f^{-1}$ denote the inverse of $f$ on its range, which we can
assume is $V$. Suppose $w_0 \in V$ and $w$ is close to $w_0$. Write $w = f(z)$ and
$w_0 = f(z_0)$. If $w \neq w_0$, we have

$$\frac{g(w) - g(w_0)}{w - w_0} = \frac{1}{\dfrac{w - w_0}{g(w) - g(w_0)}} = \frac{1}{\dfrac{f(z) - f(z_0)}{z - z_0}}.$$

Since $f'(z_0) \neq 0$, we may let $z \to z_0$ and conclude that $g$ is holomorphic
at $w_0$ with $g'(w_0) = 1/f'(g(w_0))$.


From this proposition we conclude that two open sets $U$ and $V$ are
conformally equivalent if and only if there exist holomorphic functions
$f : U \to V$ and $g : V \to U$ such that $g(f(z)) = z$ and $f(g(w)) = w$ for all
$z \in U$ and $w \in V$.

We point out that the terminology adopted here is not universal. Some
authors call a holomorphic map $f : U \to V$ conformal if $f'(z) \neq 0$ for all
$z \in U$. This definition is clearly less restrictive than ours; for example,
$f(z) = z^2$ on the punctured plane $\mathbb{C} - \{0\}$ satisfies $f'(z) \neq 0$, but is not
injective. However, the condition $f'(z) \neq 0$ is tantamount to $f$ being a
local bijection (Exercise 1). There is a geometric consequence of the
condition $f'(z) \neq 0$ and it is at the root of this discrepancy of terminology in
the definitions. A holomorphic map that satisfies this condition preserves
angles. Loosely speaking, if two curves $\gamma$ and $\eta$ intersect at $z_0$, and $\alpha$ is
the oriented angle between the tangent vectors to these curves, then the
image curves $f \circ \gamma$ and $f \circ \eta$ intersect at $f(z_0)$, and their tangent vectors
form the same angle $\alpha$. Problem 2 develops this idea.

We begin our study of conformal mappings by looking at a number
of specific examples. The first gives the conformal equivalence between
the unit disc and the upper half-plane, which plays an important role in
many problems.


1.1 The disc and upper half-plane 

The upper half-plane, which we denote by $\mathbb{H}$, consists of those complex
numbers with positive imaginary part; that is,

$$\mathbb{H} = \{z \in \mathbb{C} : \mathrm{Im}(z) > 0\}.$$

A remarkable fact, which at first seems surprising, is that the unbounded
set $\mathbb{H}$ is conformally equivalent to the unit disc. Moreover, an explicit
formula giving this equivalence exists. Indeed, let

$$F(z) = \frac{i - z}{i + z} \quad \text{and} \quad G(w) = i\,\frac{1 - w}{1 + w}.$$

Theorem 1.2 The map $F : \mathbb{H} \to \mathbb{D}$ is a conformal map with inverse
$G : \mathbb{D} \to \mathbb{H}$.


Proof. First we observe that both maps are holomorphic in their
respective domains. Then we note that any point in the upper half-plane
is closer to $i$ than to $-i$, so $|F(z)| < 1$ and $F$ maps $\mathbb{H}$ into $\mathbb{D}$. To
prove that $G$ maps into the upper half-plane, we must compute $\operatorname{Im}(G(w))$
for $w \in \mathbb{D}$. To this end we let $w = u + iv$, and note that

$$\operatorname{Im}(G(w)) = \operatorname{Re}\left(\frac{1 - u - iv}{1 + u + iv}\right)
= \operatorname{Re}\left(\frac{(1 - u - iv)(1 + u - iv)}{(1 + u)^2 + v^2}\right)
= \frac{1 - u^2 - v^2}{(1 + u)^2 + v^2} > 0$$

since $|w| < 1$. Therefore $G$ maps the unit disc to the upper half-plane.
Finally,

$$F(G(w)) = \frac{i - i\,\frac{1 - w}{1 + w}}{i + i\,\frac{1 - w}{1 + w}}
= \frac{1 + w - 1 + w}{1 + w + 1 - w} = w,$$

and similarly $G(F(z)) = z$. This proves the theorem.
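
The statement of Theorem 1.2 can be spot-checked numerically. The following is a minimal sketch, not part of the proof: the function names and the random sampling helper are illustrative assumptions, and the check only probes finitely many points.

```python
import random


def F(z):
    """Map from the upper half-plane to the unit disc: F(z) = (i - z)/(i + z)."""
    return (1j - z) / (1j + z)


def G(w):
    """Candidate inverse, from the disc to the half-plane: G(w) = i(1 - w)/(1 + w)."""
    return 1j * (1 - w) / (1 + w)


def sample_upper_half_plane(n, seed=0):
    """Return n pseudo-random points with positive imaginary part (hypothetical helper)."""
    rng = random.Random(seed)
    return [complex(rng.uniform(-5, 5), rng.uniform(0.01, 5)) for _ in range(n)]


if __name__ == "__main__":
    for z in sample_upper_half_plane(5):
        w = F(z)
        # |F(z)| should be below 1, and G should bring w back to z.
        print(z, abs(w), abs(G(w) - z))
```

Every sampled point should land strictly inside the disc and be recovered by $G$ up to rounding error.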


An interesting aspect of these functions is their behavior on the
boundaries of our open sets.² Observe that $F$ is holomorphic everywhere on $\mathbb{C}$
except at $z = -i$, and in particular it is continuous everywhere on the
boundary of $\mathbb{H}$, namely the real line. If we take $z = x$ real, then the
distance from $x$ to $i$ is the same as the distance from $x$ to $-i$; therefore
$|F(x)| = 1$. Thus $F$ maps $\mathbb{R}$ into the boundary of $\mathbb{D}$. We get more
information by writing

$$F(x) = \frac{i - x}{i + x} = \frac{1 - x^2}{1 + x^2} + i\,\frac{2x}{1 + x^2},$$

and parametrizing the real line by $x = \tan t$ with $t \in (-\pi/2, \pi/2)$. Since

$$\sin 2a = \frac{2 \tan a}{1 + \tan^2 a} \quad\text{and}\quad \cos 2a = \frac{1 - \tan^2 a}{1 + \tan^2 a},$$

we have $F(x) = \cos 2t + i \sin 2t = e^{i2t}$. Hence the image of the real line
is the arc consisting of the circle omitting the point $-1$. Moreover, as $x$
travels from $-\infty$ to $\infty$, $F(x)$ travels along that arc starting from $-1$ and
first going through that part of the circle that lies in the lower half-plane.

The point $-1$ on the circle corresponds to the "point at infinity" of
the upper half-plane.

² The boundary behavior of conformal maps is a recurrent theme that plays an
important role in this chapter.
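
The boundary parametrization $F(\tan t) = e^{2it}$ can be checked numerically. This is a small sketch under the definitions above; the helper name `boundary_point` is an assumption made for illustration.

```python
import cmath
import math


def F(z):
    """F(z) = (i - z)/(i + z), the map from the half-plane to the disc."""
    return (1j - z) / (1j + z)


def boundary_point(t):
    """Image under F of the real number x = tan(t), for t in (-pi/2, pi/2)."""
    return F(math.tan(t))


if __name__ == "__main__":
    for t in (-1.5, -0.7, 0.0, 0.4, 1.2):
        # The two printed values should agree: F(tan t) and e^{2it}.
        print(t, boundary_point(t), cmath.exp(2j * t))
```

Negative values of $t$ give points with negative imaginary part, which matches the claim that the lower half of the circle is traversed first.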

Remark. Mappings of the form

$$z \mapsto \frac{az + b}{cz + d},$$

where $a$, $b$, $c$, and $d$ are complex numbers, and where the denominator is
assumed not to be a multiple of the numerator, are usually referred to
as fractional linear transformations. Other instances occur as the
automorphisms of the disc and of the upper half-plane in Theorems 2.2
and 2.4.


1.2 Further examples 

We gather here several illustrations of conformal mappings. In certain 
cases we discuss the behavior of the map on the boundary of the relevant 
domain. Some of the mappings are pictured in Figure 1. 

Example 1. Translations and dilations provide the first simple examples.
Indeed, if $h \in \mathbb{C}$, the translation $z \mapsto z + h$ is a conformal map from $\mathbb{C}$
to itself whose inverse is $w \mapsto w - h$. If $h$ is real, then this translation is
also a conformal map from the upper half-plane to itself.

For any non-zero complex number $c$, the map $f : z \mapsto cz$ is a conformal
map from the complex plane to itself, whose inverse is simply $g : w \mapsto c^{-1}w$.
If $c$ has modulus 1, so that $c = e^{i\varphi}$ for some real $\varphi$, then $f$ is
a rotation by $\varphi$. If $c > 0$ then $f$ corresponds to a dilation. Finally, if
$c < 0$ the map $f$ consists of a dilation by $|c|$ followed by a rotation of $\pi$.

Example 2. If $n$ is a positive integer, then the map $z \mapsto z^n$ is conformal
from the sector $S = \{z \in \mathbb{C} : 0 < \arg(z) < \pi/n\}$ to the upper half-plane.
The inverse of this map is simply $w \mapsto w^{1/n}$, defined in terms of the
principal branch of the logarithm.

More generally, if $0 < \alpha < 2$ the map $f(z) = z^\alpha$ takes the upper half-plane
to the sector $S = \{w \in \mathbb{C} : 0 < \arg(w) < \alpha\pi\}$. Indeed, if we choose
the branch of the logarithm obtained by deleting the positive real axis,
and $z = re^{i\theta}$ with $r > 0$ and $0 < \theta < \pi$, then

$$f(z) = z^\alpha = |z|^\alpha e^{i\alpha\theta}.$$

Therefore $f$ maps $\mathbb{H}$ into $S$. Moreover, a simple verification shows that
the inverse of $f$ is given by $g(w) = w^{1/\alpha}$, where the branch of the
logarithm is chosen so that $0 < \arg w < \alpha\pi$.

By composing the map just discussed with the translations and rotations
in the previous example, we may map the upper half-plane conformally
to any (infinite) sector in $\mathbb{C}$.

Let us note the boundary behavior of $f$. If $x$ travels from $-\infty$ to 0 on
the real line, then $f(x)$ travels from $\infty e^{i\alpha\pi}$ to 0 on the half-line
determined by $\arg z = \alpha\pi$. As $x$ goes from 0 to $\infty$ on the real line, the image
$f(x)$ goes from 0 to $\infty$ on the real line as well.

Example 3. The map $f(z) = (1 + z)/(1 - z)$ takes the upper half-disc
$\{z = x + iy : |z| < 1 \text{ and } y > 0\}$ conformally to the first quadrant
$\{w = u + iv : u > 0 \text{ and } v > 0\}$. Indeed, if $z = x + iy$ we have

$$f(z) = \frac{1 - (x^2 + y^2)}{(1 - x)^2 + y^2} + i\,\frac{2y}{(1 - x)^2 + y^2},$$

so $f$ maps the half-disc in the upper half-plane into the first quadrant.
The inverse map, given by $g(w) = (w - 1)/(w + 1)$, is clearly holomorphic
in the first quadrant. Moreover, $|w + 1| > |w - 1|$ for all $w$ in the
first quadrant because the distance from $w$ to $-1$ is greater than the
distance from $w$ to 1; thus $g$ maps into the unit disc. Finally, an easy
calculation shows that the imaginary part of $g(w)$ is positive whenever $w$
is in the first quadrant. So $g$ transforms the first quadrant into the
desired half-disc and we conclude that $f$ is conformal because $g$ is the
inverse of $f$.

To examine the action of $f$ on the boundary, note that if $z = e^{i\theta}$
belongs to the upper half-circle, then

$$f(z) = \frac{1 + e^{i\theta}}{1 - e^{i\theta}}
= \frac{e^{-i\theta/2} + e^{i\theta/2}}{e^{-i\theta/2} - e^{i\theta/2}}
= \frac{i}{\tan(\theta/2)}.$$

As $\theta$ travels from 0 to $\pi$ we see that $f(e^{i\theta})$ travels along the imaginary
axis from infinity to 0. Moreover, if $z = x$ is real, then

$$f(x) = \frac{1 + x}{1 - x}$$

is also real; and one sees from this that $f$ is actually a bijection from
$(-1, 1)$ to the positive real axis, with $f(x)$ increasing from 0 to infinity
as $x$ travels from $-1$ to 1. Note also that $f(0) = 1$.
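
As a sanity check on Example 3, the sketch below samples the upper half-disc on a polar grid and tests the boundary formula $f(e^{i\theta}) = i/\tan(\theta/2)$. The grid helper and its parameters are illustrative assumptions.

```python
import cmath
import math


def f(z):
    """f(z) = (1 + z)/(1 - z), from the upper half-disc to the first quadrant."""
    return (1 + z) / (1 - z)


def g(w):
    """g(w) = (w - 1)/(w + 1), the inverse of f."""
    return (w - 1) / (w + 1)


def half_disc_grid(n_r=9, n_theta=15):
    """Points r*e^{i*theta} with 0 < r < 1 and 0 < theta < pi (hypothetical helper)."""
    pts = []
    for j in range(1, n_r + 1):
        r = j / (n_r + 1)
        for k in range(1, n_theta + 1):
            theta = math.pi * k / (n_theta + 1)
            pts.append(r * cmath.exp(1j * theta))
    return pts


if __name__ == "__main__":
    z = 0.3 + 0.4j
    print(f(z), g(f(z)))  # f(z) has positive real and imaginary parts
```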


Example 4. The map $z \mapsto \log z$, defined as the branch of the logarithm
obtained by deleting the negative imaginary axis, takes the upper half-plane
to the strip $\{w = u + iv : u \in \mathbb{R},\ 0 < v < \pi\}$. This is immediate
from the fact that if $z = re^{i\theta}$ with $-\pi/2 < \theta < 3\pi/2$, then by definition,

$$\log z = \log r + i\theta.$$

The inverse map is then $w \mapsto e^w$.

As $x$ travels from $-\infty$ to 0, the point $f(x)$ travels from $\infty + i\pi$ to
$-\infty + i\pi$ on the line $\{x + i\pi : -\infty < x < \infty\}$. When $x$ travels from 0
to $\infty$ on the real line, its image $f(x)$ then goes from $-\infty$ to $\infty$ along the
reals.


Example 5. With the previous example in mind, we see that
$z \mapsto \log z$ also defines a conformal map from the half-disc
$\{z = x + iy : |z| < 1,\ y > 0\}$ to the half-strip $\{w = u + iv : u < 0,\ 0 < v < \pi\}$.
As $x$ travels from 0 to 1 on the real line, $\log x$ goes from $-\infty$ to 0.
When $z$ goes from 1 to $-1$ on the half-circle in the upper half-plane,
the point $\log z$ travels from 0 to $\pi i$ on the vertical segment of the
strip. Finally, as $x$ goes from $-1$ to 0, the point $\log x$ goes from $\pi i$ to
$-\infty + i\pi$ on the top half-line of the strip.

Example 6. The map $f(z) = e^{iz}$ takes the half-strip
$\{z = x + iy : -\pi/2 < x < \pi/2,\ y > 0\}$ conformally to the half-disc
$\{w = u + iv : |w| < 1,\ u > 0\}$. This is immediate from the fact that if $z = x + iy$,
then

$$e^{iz} = e^{-y} e^{ix}.$$

If $z$ goes from $\pi/2 + i\infty$ to $\pi/2$, then $f(z)$ goes from 0 to $i$, and as $z$
goes from $\pi/2$ to $-\pi/2$, then $f(z)$ travels from $i$ to $-i$ on the half-circle.
Finally, as $z$ goes from $-\pi/2$ to $-\pi/2 + i\infty$, we see that $f(z)$ travels
from $-i$ back to 0.

The mapping $f$ is closely related to the inverse of the map in Example 5.

Example 7. The function $f(z) = -\frac{1}{2}(z + 1/z)$ is a conformal map from
the half-disc $\{z = x + iy : |z| < 1,\ y > 0\}$ to the upper half-plane
(Exercise 5).

The boundary behavior of $f$ is as follows. If $x$ travels from 0 to 1, then
$f(x)$ goes from $-\infty$ to $-1$ on the real axis. If $z = e^{i\theta}$, then $f(z) = -\cos\theta$, and
as $z$ travels from 1 to $-1$ along the unit half-circle in the upper half-plane,
$f(z)$ goes from $-1$ to 1 on the real segment. Finally, when $x$
goes from $-1$ to 0, $f(x)$ goes from 1 to $\infty$ along the real axis.

Example 8. The map $f(z) = \sin z$ takes the half-strip
$\{z = x + iy : -\pi/2 < x < \pi/2,\ y > 0\}$ conformally onto the upper
half-plane. To see this, note that if $\zeta = e^{iz}$, then

$$\sin z = \frac{e^{iz} - e^{-iz}}{2i} = -\frac{1}{2}\left(i\zeta + \frac{1}{i\zeta}\right),$$

and therefore $f$ is obtained first by applying the map in Example 6, then
multiplying by $i$ (that is, rotating by $\pi/2$), and finally applying the map
in Example 7.

As $z$ travels from $-\pi/2 + i\infty$ to $-\pi/2$, the point $f(z)$ goes from $-\infty$
to $-1$. When $x$ is real, between $-\pi/2$ and $\pi/2$, then $f(x)$ is also real
between $-1$ and 1. Finally, if $z$ goes from $\pi/2$ to $\pi/2 + i\infty$, then $f(z)$
travels from 1 to $\infty$ on the real axis.
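
The factorization of $\sin z$ through Examples 6 and 7 can be confirmed numerically. The sketch below composes the three maps and compares the result with `cmath.sin`; the function names are illustrative assumptions.

```python
import cmath


def example6(z):
    """z -> e^{iz}: half-strip to the half-disc {|w| < 1, Re(w) > 0}."""
    return cmath.exp(1j * z)


def rotate(z):
    """Multiplication by i, that is, rotation by pi/2."""
    return 1j * z


def example7(z):
    """z -> -(z + 1/z)/2: upper half-disc to the upper half-plane."""
    return -0.5 * (z + 1 / z)


def composed(z):
    """Example 6, then the rotation, then Example 7."""
    return example7(rotate(example6(z)))


if __name__ == "__main__":
    for z in (0.5 + 1j, -1.2 + 0.3j, 2j, 1.5 + 0.05j):
        # The two printed values should coincide up to rounding.
        print(composed(z), cmath.sin(z))
```

For points of the half-strip the output also has positive imaginary part, as expected of a map into the upper half-plane.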

1.3 The Dirichlet problem in a strip 

The Dirichlet problem in the open set $\Omega$ consists of solving

$$\begin{cases} \Delta u = 0 & \text{in } \Omega, \\ u = f & \text{on } \partial\Omega, \end{cases} \tag{1}$$

where $\Delta$ denotes the Laplacian $\partial^2/\partial x^2 + \partial^2/\partial y^2$, and $f$ is a given
function on the boundary of $\Omega$. In other words, we wish to find a harmonic
function in $\Omega$ with prescribed boundary values $f$. This problem was
already considered in Book I in the cases where $\Omega$ is the unit disc or the
upper half-plane, where it arose in the solution of the steady-state heat
equation. In these specific examples, explicit solutions were obtained in
terms of convolutions with the Poisson kernels.

[Figure 1. Explicit conformal maps, including $f_1(z) = e^{iz}$, $f_2(z) = iz$, and $f_3(z) = -\frac{1}{2}(z + 1/z)$]

Our goal here is to connect the Dirichlet problem with the conformal
maps discussed so far. We begin by providing a formula for a solution to
the problem (1) in the special case where $\Omega$ is a strip. In fact, this example
was studied in Problem 3 of Chapter 5, Book I, where the problem
was solved using the Fourier transform. Here, we recover this solution
using only conformal mappings and the known solution in the disc.

The first important fact that we use is that the composition of a
harmonic function with a holomorphic function is still harmonic.

Lemma 1.3 Let $V$ and $U$ be open sets in $\mathbb{C}$ and $F : V \to U$ a
holomorphic function. If $u : U \to \mathbb{C}$ is a harmonic function, then $u \circ F$ is
harmonic on $V$.

Proof. The thrust of the lemma is purely local, so we may assume
that $U$ is an open disc. We let $G$ be a holomorphic function in $U$ whose
real part is $u$ (such a $G$ exists by Exercise 12 in Chapter 2, and is
determined up to an additive constant). Let $H = G \circ F$ and note that $u \circ F$
is the real part of $H$. Hence $u \circ F$ is harmonic because $H$ is holomorphic.

For an alternate (computational) proof of this lemma, see Exercise 6. 
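
Lemma 1.3 can be illustrated with a finite-difference Laplacian. The sketch below assumes the particular choices $u(x, y) = x^2 - y^2$ and $F(z) = e^z$, which are not from the text, and uses a five-point stencil, so the result is only approximately zero.

```python
import cmath


def u(w):
    """A harmonic function: u(x, y) = x^2 - y^2, the real part of w^2."""
    return w.real ** 2 - w.imag ** 2


def F(z):
    """A holomorphic map, chosen here for illustration: F(z) = e^z."""
    return cmath.exp(z)


def composite(z):
    """The composition u o F, which Lemma 1.3 asserts is harmonic."""
    return u(F(z))


def laplacian(func, z, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian of func at z."""
    total = func(z + h) + func(z - h) + func(z + 1j * h) + func(z - 1j * h)
    return (total - 4 * func(z)) / h ** 2


if __name__ == "__main__":
    z0 = 0.3 + 0.2j
    print(laplacian(composite, z0))              # close to 0
    print(laplacian(lambda z: abs(z) ** 2, z0))  # close to 4, not harmonic
```

The second line of output is a control: $x^2 + y^2$ has Laplacian 4, so the stencil does distinguish harmonic from non-harmonic functions.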

With this result in hand, we may now consider the problem (1) when
$\Omega$ consists of the horizontal strip

$$\Omega = \{x + iy : x \in \mathbb{R},\ 0 < y < 1\},$$

whose boundary is the union of the two horizontal lines $\mathbb{R}$ and $i + \mathbb{R}$. We
express the boundary data as two functions $f_0$ and $f_1$ defined on $\mathbb{R}$, and
ask for a solution $u(x, y)$ in $\Omega$ of $\Delta u = 0$ that satisfies

$$u(x, 0) = f_0(x) \quad\text{and}\quad u(x, 1) = f_1(x).$$

We shall assume that $f_0$ and $f_1$ are continuous and vanish at infinity,
that is, that $\lim_{|x| \to \infty} f_j(x) = 0$ for $j = 0, 1$.

The method we shall follow consists of relocating the problem from
the strip to the unit disc via a conformal map. In the disc the solution
$\tilde u$ is then expressed in terms of a convolution with the Poisson kernel.
Finally, $\tilde u$ is moved back to the strip using the inverse of the previous
conformal map, thereby giving our final answer to the problem.

To achieve our goal, we introduce the mappings $F : \mathbb{D} \to \Omega$ and
$G : \Omega \to \mathbb{D}$, that are defined by

$$F(w) = \frac{1}{\pi} \log\left(i\,\frac{1 - w}{1 + w}\right) \quad\text{and}\quad G(z) = \frac{i - e^{\pi z}}{i + e^{\pi z}}.$$

These two functions, which are obtained from composing mappings from
examples in the previous sections, are conformal and inverses to one
another. Tracing through the boundary behavior of $F$, we find that it
maps the lower half-circle to the line $i + \mathbb{R}$, and the upper half-circle to
$\mathbb{R}$. More precisely, as $\varphi$ travels from $-\pi$ to 0, then $F(e^{i\varphi})$ goes from
$i + \infty$ to $i - \infty$, and as $\varphi$ travels from 0 to $\pi$, then $F(e^{i\varphi})$ goes from $-\infty$
to $\infty$ on the real line.

[Figure 2. The Dirichlet problem in a strip: $u = f_1$ on the line $i + \mathbb{R}$, $u = f_0$ on $\mathbb{R}$, with the points $z = iy$ marked]

With the behavior of $F$ on the circle in mind, we define

$$\tilde f_1(\varphi) = f_1(F(e^{i\varphi}) - i) \quad\text{whenever } -\pi < \varphi < 0,$$

and

$$\tilde f_0(\varphi) = f_0(F(e^{i\varphi})) \quad\text{whenever } 0 < \varphi < \pi.$$

Then, since $f_0$ and $f_1$ vanish at infinity, the function $\tilde f$ that is equal
to $\tilde f_1$ on the lower semi-circle, $\tilde f_0$ on the upper semi-circle, and 0 at the
points $\varphi = \pm\pi, 0$, is continuous on the whole circle. The solution to the
Dirichlet problem in the unit disc with boundary data $\tilde f$ is given by the
Poisson integral³

$$\tilde u(w) = \frac{1}{2\pi} \int_{-\pi}^{\pi} P_r(\theta - \varphi)\,\tilde f(\varphi)\,d\varphi
= \frac{1}{2\pi} \int_{-\pi}^{0} P_r(\theta - \varphi)\,\tilde f_1(\varphi)\,d\varphi
+ \frac{1}{2\pi} \int_{0}^{\pi} P_r(\theta - \varphi)\,\tilde f_0(\varphi)\,d\varphi,$$

where $w = re^{i\theta}$, and

$$P_r(\theta) = \frac{1 - r^2}{1 - 2r\cos\theta + r^2}$$

is the Poisson kernel. Lemma 1.3 guarantees that the function $u$, defined
by

$$u(z) = \tilde u(G(z)),$$

is harmonic in the strip. Moreover, our construction also ensures that $u$
has the correct boundary values.

³ We refer the reader to Chapter 2 in Book I for a detailed discussion of the Dirichlet
problem in the disc and the Poisson integral formula. Also, the Poisson integral formula
is deduced in Exercise 12 of Chapter 2 and Problem 2 in Chapter 3 of this book.

A formula for $u$ in terms of $f_0$ and $f_1$ is first obtained at the points
$z = iy$ with $0 < y < 1$. The appropriate change of variables (see Exercise 7)
shows that if $re^{i\theta} = G(iy)$, then

$$\frac{1}{2\pi} \int_{0}^{\pi} P_r(\theta - \varphi)\,\tilde f_0(\varphi)\,d\varphi
= \frac{\sin \pi y}{2} \int_{-\infty}^{\infty} \frac{f_0(t)}{\cosh \pi t - \cos \pi y}\,dt.$$

A similar calculation also establishes

$$\frac{1}{2\pi} \int_{-\pi}^{0} P_r(\theta - \varphi)\,\tilde f_1(\varphi)\,d\varphi
= \frac{\sin \pi y}{2} \int_{-\infty}^{\infty} \frac{f_1(t)}{\cosh \pi t + \cos \pi y}\,dt.$$

Adding these last two integrals provides a formula for $u(0, y)$. In general,
we recall from Exercise 13 in Chapter 5 of Book I, that a solution to
the Dirichlet problem in the strip vanishing at infinity is unique.
Consequently, a translation of the boundary condition by $x$ results in a
translation of the solution by $x$ as well. We may therefore apply the same
argument to $f_0(x + t)$ and $f_1(x + t)$ (with $x$ fixed), and a final change of
variables shows that

$$u(x, y) = \frac{\sin \pi y}{2} \left( \int_{-\infty}^{\infty} \frac{f_0(x - t)}{\cosh \pi t - \cos \pi y}\,dt
+ \int_{-\infty}^{\infty} \frac{f_1(x - t)}{\cosh \pi t + \cos \pi y}\,dt \right),$$

which gives a solution to the Dirichlet problem in the strip. In particular,
we find that the solution is given in terms of convolutions with
the functions $f_0$ and $f_1$. Also, note that at the mid-point of the strip
($y = 1/2$), the solution is given by integration with respect to the
function $1/\cosh \pi t$; this function happens to be its own Fourier transform,
as we saw in Example 3, Chapter 3.
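
The convolution formula for the strip can be evaluated by simple quadrature. The sketch below uses the trapezoid rule on a truncated interval; the function name, the cut-off `T`, and the number of steps `n` are assumptions chosen for illustration. As a sanity check it uses constant boundary data, which does not vanish at infinity as the text requires, but for which the two kernels integrate to $1 - y$ and $y$ respectively.

```python
import math


def strip_solution(f0, f1, x, y, T=20.0, n=8000):
    """Approximate u(x, y) in the strip 0 < y < 1 from boundary data f0, f1.

    Trapezoid rule on [-T, T] applied to the two convolution integrals.
    """
    s = math.sin(math.pi * y) / 2
    c = math.cos(math.pi * y)
    h = 2 * T / n
    total = 0.0
    for k in range(n + 1):
        t = -T + k * h
        weight = 0.5 if k in (0, n) else 1.0  # end points count half
        ch = math.cosh(math.pi * t)
        total += weight * (f0(x - t) / (ch - c) + f1(x - t) / (ch + c))
    return s * h * total


if __name__ == "__main__":
    one = lambda t: 1.0
    zero = lambda t: 0.0
    print(strip_solution(one, zero, 0.0, 0.25))  # approximately 1 - y
    print(strip_solution(zero, one, 0.0, 0.25))  # approximately y
```

With data equal to 1 on the bottom line and 0 on the top line, the exact bounded harmonic function is $1 - y$, which the quadrature should reproduce closely.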


Remarks about the Dirichlet problem 

The example above leads us to envisage the solution of the more general
Dirichlet problem for $\Omega$ (a suitable region), if we know a conformal map $F$
from the disc $\mathbb{D}$ to $\Omega$. That is, suppose we wish to solve (1), where $f$ is
an assigned continuous function and $\partial\Omega$ is the boundary of $\Omega$. Assuming
we have a conformal map $F$ from $\mathbb{D}$ to $\Omega$ (that extends to a continuous
bijection of the boundary of the disc to the boundary of $\Omega$), then $\tilde f = f \circ F$
is defined on the circle, and we can solve the Dirichlet problem
for the disc with boundary data $\tilde f$. The solution is given by the Poisson
integral formula

$$\tilde u(re^{i\theta}) = \frac{1}{2\pi} \int_{0}^{2\pi} P_r(\theta - \varphi)\,\tilde f(e^{i\varphi})\,d\varphi,$$

where $P_r$ is the Poisson kernel. Then, one can expect that the solution
of the original problem is given by $u = \tilde u \circ F^{-1}$.

Success with this approach requires that we are able to resolve
affirmatively two questions:

• Does there exist a conformal map $\Phi = F^{-1}$ from $\Omega$ to $\mathbb{D}$?

• If so, does this map extend to a continuous bijection from the
boundary of $\Omega$ to the boundary of $\mathbb{D}$?

The first question, that of existence, is settled by the Riemann mapping
theorem, which we prove in the next section. It is completely general
(assuming only that $\Omega$ is a proper subset of $\mathbb{C}$ that is simply connected),
and necessitates no regularity of the boundary of $\Omega$. A positive answer
to the second question requires some regularity of $\partial\Omega$. A particular case,
when $\Omega$ is the interior of a polygon, is treated below in Section 4.3. (See
Exercise 18 and Problem 6 for more general assertions.)

It is interesting to note that in Riemann's original approach to the
mapping problem, the chain of implications was reversed: his idea was
that the existence of the conformal map $\Phi$ from $\Omega$ to $\mathbb{D}$ is a consequence
of the solvability of the Dirichlet problem in $\Omega$. He argued as follows.
Suppose we wish to find such a $\Phi$, with the property that a given point
$z_0 \in \Omega$ is mapped to 0. Then $\Phi$ must be of the form

$$\Phi(z) = (z - z_0)\,G(z),$$

where $G$ is holomorphic and non-vanishing in $\Omega$. Hence we can take

$$\Phi(z) = (z - z_0)\,e^{H(z)}$$

for suitable $H$. Now if $u(z)$ is the harmonic function given by $u = \operatorname{Re}(H)$,
then the fact that $|\Phi(z)| = 1$ on $\partial\Omega$ means that $u$ must satisfy the boundary
condition $u(z) = \log(1/|z - z_0|)$ for $z \in \partial\Omega$. So if we can find such a
solution $u$ of the Dirichlet problem,⁴ we can construct $H$, and from this
the mapping function $\Phi$.

⁴ The harmonic function $u(z)$ is also known as the Green's function with source $z_0$ for
the region $\Omega$.

However, there are several shortcomings to this method. First, one
has to verify that $\Phi$ is a bijection. In addition, to succeed, this method
requires some regularity of the boundary of $\Omega$. Moreover, one is still
faced with the question of solving the Dirichlet problem for $\Omega$. At this
stage Riemann proposed using the "Dirichlet principle." But applying
this idea involves difficulties that must be overcome.⁵

Nevertheless, using different methods, one can prove the existence of
the mapping in the general case. This approach is carried out below in
Section 3.

2 The Schwarz lemma; automorphisms of the disc and 
upper half-plane 

The statement and proof of the Schwarz lemma are both simple, but the
applications of this result are far-reaching. We recall that a rotation is
a map of the form $z \mapsto cz$ with $|c| = 1$, namely $c = e^{i\theta}$, where $\theta \in \mathbb{R}$ is
called the angle of rotation and is well-defined up to an integer multiple
of $2\pi$.

Lemma 2.1 Let $f : \mathbb{D} \to \mathbb{D}$ be holomorphic with $f(0) = 0$. Then

(i) $|f(z)| \le |z|$ for all $z \in \mathbb{D}$.

(ii) If for some $z_0 \neq 0$ we have $|f(z_0)| = |z_0|$, then $f$ is a rotation.

(iii) $|f'(0)| \le 1$, and if equality holds, then $f$ is a rotation.

Proof. We first expand $f$ in a power series centered at 0 and convergent
in all of $\mathbb{D}$

$$f(z) = a_0 + a_1 z + a_2 z^2 + \cdots.$$

Since $f(0) = 0$ we have $a_0 = 0$, and therefore $f(z)/z$ is holomorphic in
$\mathbb{D}$ (since it has a removable singularity at 0). If $|z| = r < 1$, then since
$|f(z)| \le 1$ we have

$$\left|\frac{f(z)}{z}\right| \le \frac{1}{r},$$

and by the maximum modulus principle, we can conclude that this is
true whenever $|z| \le r$. Letting $r \to 1$ gives the first result.

For (ii), we see that $f(z)/z$ attains its maximum in the interior of $\mathbb{D}$ and
must therefore be constant, say $f(z) = cz$. Evaluating this expression
at $z_0$ and taking absolute values, we find that $|c| = 1$. Therefore, there
exists $\theta \in \mathbb{R}$ such that $c = e^{i\theta}$, and that explains why $f$ is a rotation.

⁵ An implementation of Dirichlet's principle in the present two-dimensional situation is
taken up in Book III.

Finally, observe that if $g(z) = f(z)/z$, then $|g(z)| \le 1$ throughout $\mathbb{D}$,
and moreover

$$g(0) = \lim_{z \to 0} \frac{f(z) - f(0)}{z} = f'(0).$$

Hence, if $|f'(0)| = 1$, then $|g(0)| = 1$, and by the maximum principle $g$ is
constant, which implies $f(z) = cz$ with $|c| = 1$.
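
The conclusions of the Schwarz lemma can be observed on a concrete map. The sketch below assumes the example $f(z) = z(z - a)/(1 - az)$ with $a = 1/2$, which is not from the text; it sends the disc into itself and fixes the origin, so (i) and (iii) should hold with strict inequality away from 0.

```python
import cmath
import math


def f(z, a=0.5):
    """Illustrative map f(z) = z (z - a)/(1 - a z), with a real and |a| < 1."""
    return z * (z - a) / (1 - a * z)


def disc_samples(n_r=9, n_theta=24):
    """The origin plus points r*e^{i*theta} with 0 < r < 1 (hypothetical helper)."""
    pts = [0j]
    for j in range(1, n_r + 1):
        r = j / (n_r + 1)
        for k in range(n_theta):
            pts.append(r * cmath.exp(2j * math.pi * k / n_theta))
    return pts


def derivative_at_zero(func, h=1e-6):
    """Central-difference estimate of func'(0) along the real axis."""
    return (func(h) - func(-h)) / (2 * h)


if __name__ == "__main__":
    print(max(abs(f(z)) - abs(z) for z in disc_samples()))  # never positive
    print(abs(derivative_at_zero(f)))                        # about 0.5
```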


Our first application of this lemma is to the determination of the
automorphisms of the disc.


2.1 Automorphisms of the disc 

A conformal map from an open set $\Omega$ to itself is called an automorphism
of $\Omega$. The set of all automorphisms of $\Omega$ is denoted by $\mathrm{Aut}(\Omega)$,
and carries the structure of a group. The group operation is composition
of maps, the identity element is the map $z \mapsto z$, and the inverses are simply
the inverse functions. It is clear that if $f$ and $g$ are automorphisms
of $\Omega$, then $f \circ g$ is also an automorphism, and in fact, its inverse is given
by

$$(f \circ g)^{-1} = g^{-1} \circ f^{-1}.$$

As mentioned above, the identity map is always an automorphism. We
can give other more interesting automorphisms of the unit disc. Obviously,
any rotation by an angle $\theta \in \mathbb{R}$, that is, $r_\theta : z \mapsto e^{i\theta} z$, is an
automorphism of the unit disc whose inverse is the rotation by the angle $-\theta$,
that is, $r_{-\theta} : z \mapsto e^{-i\theta} z$. More interesting are the automorphisms of the
form

$$\psi_\alpha(z) = \frac{\alpha - z}{1 - \bar\alpha z}, \quad\text{where } \alpha \in \mathbb{C} \text{ with } |\alpha| < 1.$$


These mappings, which were introduced in Exercise 7 of Chapter 1,
appear in a number of problems in complex analysis because of their
many useful properties. The proof that they are automorphisms of $\mathbb{D}$ is
quite simple. First, observe that since $|\alpha| < 1$, the map $\psi_\alpha$ is holomorphic
in the unit disc. If $|z| = 1$ then $z = e^{i\theta}$ and

$$\psi_\alpha(e^{i\theta}) = \frac{\alpha - e^{i\theta}}{e^{i\theta}(e^{-i\theta} - \bar\alpha)} = -e^{-i\theta}\,\frac{w}{\bar w},$$

where $w = \alpha - e^{i\theta}$; therefore $|\psi_\alpha(z)| = 1$. By the maximum modulus
principle, we conclude that $|\psi_\alpha(z)| < 1$ for all $z \in \mathbb{D}$. Finally we make
the following very simple observation:

$$(\psi_\alpha \circ \psi_\alpha)(z)
= \frac{\alpha - \dfrac{\alpha - z}{1 - \bar\alpha z}}{1 - \bar\alpha\,\dfrac{\alpha - z}{1 - \bar\alpha z}}
= \frac{\alpha - |\alpha|^2 z - \alpha + z}{1 - \bar\alpha z - |\alpha|^2 + \bar\alpha z}
= \frac{(1 - |\alpha|^2)\,z}{1 - |\alpha|^2}
= z,$$

from which we conclude that $\psi_\alpha$ is its own inverse! Another important
property of $\psi_\alpha$ is that it vanishes at $z = \alpha$; moreover it interchanges 0
and $\alpha$, namely

$$\psi_\alpha(0) = \alpha \quad\text{and}\quad \psi_\alpha(\alpha) = 0.$$
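
The three properties of $\psi_\alpha$ just derived (unit modulus on the circle, being its own inverse, and swapping 0 with $\alpha$) are easy to test numerically. The following sketch uses an arbitrary value of $\alpha$ chosen for illustration.

```python
import cmath


def psi(alpha, z):
    """Blaschke factor psi_alpha(z) = (alpha - z)/(1 - conj(alpha) z), |alpha| < 1."""
    alpha = complex(alpha)
    return (alpha - z) / (1 - alpha.conjugate() * z)


if __name__ == "__main__":
    alpha = 0.3 + 0.4j
    z = 0.1 - 0.7j
    print(psi(alpha, psi(alpha, z)))         # returns z up to rounding
    print(psi(alpha, 0), psi(alpha, alpha))  # alpha and 0
    print(abs(psi(alpha, cmath.exp(2j))))    # 1 on the unit circle
```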


The next theorem says that the rotations combined with the maps $\psi_\alpha$
exhaust all the automorphisms of the disc.

Theorem 2.2 If $f$ is an automorphism of the disc, then there exist $\theta \in \mathbb{R}$
and $\alpha \in \mathbb{D}$ such that

$$f(z) = e^{i\theta}\,\frac{\alpha - z}{1 - \bar\alpha z}.$$

Proof. Since $f$ is an automorphism of the disc, there exists a unique
complex number $\alpha \in \mathbb{D}$ such that $f(\alpha) = 0$. Now we consider the
automorphism $g$ defined by $g = f \circ \psi_\alpha$. Then $g(0) = 0$, and the Schwarz
lemma gives

$$|g(z)| \le |z| \quad\text{for all } z \in \mathbb{D}. \tag{2}$$

Moreover, $g^{-1}(0) = 0$, so applying the Schwarz lemma to $g^{-1}$, we find
that

$$|g^{-1}(w)| \le |w| \quad\text{for all } w \in \mathbb{D}.$$

Using this last inequality for $w = g(z)$ for each $z \in \mathbb{D}$ gives

$$|z| \le |g(z)| \quad\text{for all } z \in \mathbb{D}. \tag{3}$$

Combining (2) and (3) we find that $|g(z)| = |z|$ for all $z \in \mathbb{D}$, and by the
Schwarz lemma we conclude that $g(z) = e^{i\theta} z$ for some $\theta \in \mathbb{R}$. Replacing
$z$ by $\psi_\alpha(z)$ and using the fact that $(\psi_\alpha \circ \psi_\alpha)(z) = z$, we deduce that
$f(z) = e^{i\theta}\psi_\alpha(z)$, as claimed.

Setting $\alpha = 0$ in the theorem yields the following result.

Corollary 2.3 The only automorphisms of the unit disc that fix the origin
are the rotations.

Note that by the use of the mappings $\psi_\alpha$, we can see that the group of
automorphisms of the disc acts transitively, in the sense that given any
pair of points $\alpha$ and $\beta$ in the disc, there is an automorphism $\psi$ mapping
$\alpha$ to $\beta$. One such $\psi$ is given by $\psi = \psi_\beta \circ \psi_\alpha$.
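
Transitivity can be seen concretely: $\psi_\alpha$ sends $\alpha$ to 0 and $\psi_\beta$ sends 0 to $\beta$. The sketch below composes the two; the name `move` and the sample points are assumptions made for illustration.

```python
def psi(alpha, z):
    """psi_alpha(z) = (alpha - z)/(1 - conj(alpha) z), an automorphism of the disc."""
    alpha = complex(alpha)
    return (alpha - z) / (1 - alpha.conjugate() * z)


def move(alpha, beta, z):
    """Apply psi_beta o psi_alpha, which maps alpha to beta."""
    return psi(beta, psi(alpha, z))


if __name__ == "__main__":
    alpha, beta = 0.3 + 0.4j, -0.6 + 0.1j
    print(move(alpha, beta, alpha))  # beta, up to rounding
```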

The explicit formulas for the automorphisms of $\mathbb{D}$ give a good
description of the group $\mathrm{Aut}(\mathbb{D})$. In fact, this group of automorphisms is
"almost" isomorphic to a group of $2 \times 2$ matrices with complex entries
often denoted by $\mathrm{SU}(1,1)$. This group consists of all $2 \times 2$ matrices that
preserve the hermitian form on $\mathbb{C}^2 \times \mathbb{C}^2$ defined by

$$\langle Z, W \rangle = z_1 \bar w_1 - z_2 \bar w_2,$$

where $Z = (z_1, z_2)$ and $W = (w_1, w_2)$. For more information about this
subject, we refer the reader to Problem 4.

2.2 Automorphisms of the upper half-plane 

Our knowledge of the automorphisms of $\mathbb{D}$ together with the conformal
map $F : \mathbb{H} \to \mathbb{D}$ found in Section 1.1 allow us to determine the group of
automorphisms of $\mathbb{H}$, which we denote by $\mathrm{Aut}(\mathbb{H})$.

Consider the map

$$\Gamma : \mathrm{Aut}(\mathbb{D}) \to \mathrm{Aut}(\mathbb{H})$$

given by "conjugation by $F$":

$$\Gamma(\varphi) = F^{-1} \circ \varphi \circ F.$$

It is clear that $\Gamma(\varphi)$ is an automorphism of $\mathbb{H}$ whenever $\varphi$ is an
automorphism of $\mathbb{D}$, and $\Gamma$ is a bijection whose inverse is given by
$\Gamma^{-1}(\psi) = F \circ \psi \circ F^{-1}$. In fact, we prove more, namely that $\Gamma$ preserves the
operations on the corresponding groups of automorphisms. Indeed, suppose
that $\varphi_1, \varphi_2 \in \mathrm{Aut}(\mathbb{D})$. Since $F \circ F^{-1}$ is the identity on $\mathbb{D}$ we find that

$$\begin{aligned}
\Gamma(\varphi_1 \circ \varphi_2) &= F^{-1} \circ \varphi_1 \circ \varphi_2 \circ F \\
&= F^{-1} \circ \varphi_1 \circ F \circ F^{-1} \circ \varphi_2 \circ F \\
&= \Gamma(\varphi_1) \circ \Gamma(\varphi_2).
\end{aligned}$$

The conclusion is that the two groups $\mathrm{Aut}(\mathbb{D})$ and $\mathrm{Aut}(\mathbb{H})$ are the same,
since $\Gamma$ defines an isomorphism between them. We are still left with the
task of giving a description of elements of $\mathrm{Aut}(\mathbb{H})$. A series of calculations,
which consist of pulling back the automorphisms of the disc to the
upper half-plane via $F$, can be used to verify that $\mathrm{Aut}(\mathbb{H})$ consists of all
maps

$$z \mapsto \frac{az + b}{cz + d},$$

where $a$, $b$, $c$, and $d$ are real numbers with $ad - bc = 1$. Again, a matrix
group is lurking in the background. Let $\mathrm{SL}_2(\mathbb{R})$ denote the group of all
$2 \times 2$ matrices with real entries and determinant 1, namely

$$\mathrm{SL}_2(\mathbb{R}) = \left\{ M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{R} \text{ and } \det(M) = ad - bc = 1 \right\}.$$

This group is called the special linear group.

Given a matrix $M \in \mathrm{SL}_2(\mathbb{R})$ we define the mapping $f_M$ by

$$f_M(z) = \frac{az + b}{cz + d}.$$

Theorem 2.4 Every automorphism of $\mathbb{H}$ takes the form $f_M$ for some
$M \in \mathrm{SL}_2(\mathbb{R})$. Conversely, every map of this form is an automorphism of
$\mathbb{H}$.

The proof consists of a sequence of steps. For brevity, we denote the
group $\mathrm{SL}_2(\mathbb{R})$ by $\mathcal{G}$.

Step 1. If $M \in \mathcal{G}$, then $f_M$ maps $\mathbb{H}$ to itself. This is clear from the
observation that

$$\operatorname{Im}(f_M(z)) = \frac{(ad - bc)\operatorname{Im}(z)}{|cz + d|^2} = \frac{\operatorname{Im}(z)}{|cz + d|^2} > 0 \quad\text{whenever } z \in \mathbb{H}. \tag{4}$$

Step 2. If $M$ and $M'$ are two matrices in $\mathcal{G}$, then $f_M \circ f_{M'} = f_{MM'}$.
This follows from a straightforward calculation, which we omit. As a
consequence, we can prove the first half of the theorem. Each $f_M$ is an
automorphism because it has a holomorphic inverse $(f_M)^{-1}$, which is
simply $f_{M^{-1}}$. Indeed, if $I$ is the identity matrix, then

$$(f_M \circ f_{M^{-1}})(z) = f_{MM^{-1}}(z) = f_I(z) = z.$$
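
The calculation omitted in Step 2 can at least be checked on examples. The sketch below represents a matrix as a pair of rows; the helper names and the sample matrices are illustrative assumptions, and `inverse` is only valid for determinant one.

```python
def mobius(M, z):
    """Apply f_M(z) = (a z + b)/(c z + d) for M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)


def matmul(M, N):
    """Product of two 2x2 matrices given as pairs of rows."""
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def inverse(M):
    """Inverse of a 2x2 matrix of determinant one."""
    (a, b), (c, d) = M
    return ((d, -b), (-c, a))


if __name__ == "__main__":
    M = ((2, 1), (3, 2))  # determinant 1
    N = ((1, 2), (0, 1))  # determinant 1
    z = 0.5 + 1.5j
    print(mobius(M, mobius(N, z)), mobius(matmul(M, N), z))  # equal
    print(mobius(M, z).imag, z.imag / abs(3 * z + 2) ** 2)   # equation (4)
```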

Step 3. Given any two points $z$ and $w$ in $\mathbb{H}$, there exists $M \in \mathcal{G}$ such
that $f_M(z) = w$, and therefore $\mathcal{G}$ acts transitively on $\mathbb{H}$. To prove this,
it suffices to show that we can map any $z \in \mathbb{H}$ to $i$. Setting $d = 0$ in
equation (4) above gives

$$\operatorname{Im}(f_M(z)) = \frac{\operatorname{Im}(z)}{|cz|^2},$$

and we may choose a real number $c$ so that $\operatorname{Im}(f_M(z)) = 1$. Next we
choose the matrix

$$M_1 = \begin{pmatrix} 0 & -c^{-1} \\ c & 0 \end{pmatrix}$$

so that $f_{M_1}(z)$ has imaginary part equal to 1. Then we translate by a
matrix of the form

$$M_2 = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} \quad\text{with } b \in \mathbb{R},$$

to bring $f_{M_1}(z)$ to $i$. Finally, the map $f_M$ with $M = M_2 M_1$ takes $z$ to $i$.
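
The construction in Step 3 is explicit enough to compute. The sketch below builds $M = M_2 M_1$ for a given point of the upper half-plane; the function names are assumptions, and the choice $c = \sqrt{\operatorname{Im}(z)}/|z|$ is one solution of $\operatorname{Im}(z)/|cz|^2 = 1$.

```python
import math


def mobius(M, z):
    """Apply f_M(z) = (a z + b)/(c z + d) for M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)


def matmul(M, N):
    """Product of two 2x2 matrices given as pairs of rows."""
    (a, b), (c, d) = M
    (e, f), (g, h) = N
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def det(M):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = M
    return a * d - b * c


def matrix_to_i(z):
    """Return M = M2 M1 with real entries, determinant 1, and f_M(z) = i."""
    c = math.sqrt(z.imag) / abs(z)    # ensures Im(z)/|c z|^2 = 1
    M1 = ((0.0, -1.0 / c), (c, 0.0))  # f_{M1}(z) has imaginary part 1
    w = mobius(M1, z)
    M2 = ((1.0, -w.real), (0.0, 1.0)) # horizontal translation onto i
    return matmul(M2, M1)


if __name__ == "__main__":
    z = -1.5 + 0.2j
    M = matrix_to_i(z)
    print(M, det(M), mobius(M, z))
```

To send $z$ to an arbitrary $w$ in the half-plane, one can compose the map for $z$ with the inverse of the map for $w$.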
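Step 4. If $\theta$ is real, then the matrix

$$M_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

belongs to $\mathcal{G}$, and if $F : \mathbb{H} \to \mathbb{D}$ denotes the standard conformal map, then
$F \circ f_{M_\theta} \circ F^{-1}$ corresponds to the rotation of angle $-2\theta$ in the disc. This
follows from the fact that $F \circ f_{M_\theta} = e^{-2i\theta} F(z)$, which is easily verified.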
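
The identity $F(f_{M_\theta}(z)) = e^{-2i\theta} F(z)$ used in Step 4 can be verified numerically. This is a minimal sketch; the helper names and test values are illustrative assumptions.

```python
import cmath
import math


def F(z):
    """Standard conformal map from the upper half-plane to the disc."""
    return (1j - z) / (1j + z)


def rotation_matrix(theta):
    """The matrix M_theta = ((cos t, -sin t), (sin t, cos t))."""
    return ((math.cos(theta), -math.sin(theta)),
            (math.sin(theta), math.cos(theta)))


def mobius(M, z):
    """Apply f_M(z) = (a z + b)/(c z + d) for M = ((a, b), (c, d))."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)


if __name__ == "__main__":
    theta, z = 0.7, 0.3 + 2j
    lhs = F(mobius(rotation_matrix(theta), z))
    rhs = cmath.exp(-2j * theta) * F(z)
    print(lhs, rhs)  # the two values agree
```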

Step 5. We can now complete the proof of the theorem. We suppose
$f$ is an automorphism of $\mathbb{H}$ with $f(\beta) = i$, and consider a matrix $N \in \mathcal{G}$
such that $f_N(i) = \beta$. Then $g = f \circ f_N$ satisfies $g(i) = i$, and therefore
$F \circ g \circ F^{-1}$ is an automorphism of the disc that fixes the origin. So
$F \circ g \circ F^{-1}$ is a rotation, and by Step 4 there exists $\theta \in \mathbb{R}$ such that

$$F \circ g \circ F^{-1} = F \circ f_{M_\theta} \circ F^{-1}.$$

Hence $g = f_{M_\theta}$, and we conclude that $f = f_{M_\theta N^{-1}}$, which is of the desired
form.

A final observation is that the group $\mathrm{Aut}(\mathbb{H})$ is not quite isomorphic
with $\mathrm{SL}_2(\mathbb{R})$. The reason for this is that the two matrices $M$ and $-M$
give rise to the same function $f_M = f_{-M}$. Therefore, if we identify the
two matrices $M$ and $-M$, then we obtain a new group $\mathrm{PSL}_2(\mathbb{R})$ called
the projective special linear group; this group is isomorphic with
$\mathrm{Aut}(\mathbb{H})$.
3 The Riemann mapping theorem

3.1 Necessary conditions and statement of the theorem 

We now come to the promised cornerstone of this chapter. The basic
problem is to determine conditions on an open set $\Omega$ that guarantee the
existence of a conformal map $F : \Omega \to \mathbb{D}$.

A series of simple observations allow us to find necessary conditions
on $\Omega$. First, if $\Omega = \mathbb{C}$ there can be no conformal map $F : \Omega \to \mathbb{D}$, since
by Liouville's theorem $F$ would have to be a constant. Therefore, a
necessary condition is to assume that $\Omega \neq \mathbb{C}$. Since $\mathbb{D}$ is connected, we
must also impose the requirement that $\Omega$ be connected. There is still
one more condition that is forced upon us: since $\mathbb{D}$ is simply connected,
the same must be true of $\Omega$ (see Exercise 3). It is remarkable that
these conditions on $\Omega$ are also sufficient to guarantee the existence of a
biholomorphism from $\Omega$ to $\mathbb{D}$.

For brevity, we shall call a subset $\Omega$ of $\mathbb{C}$ proper if it is non-empty
and not the whole of $\mathbb{C}$.

Theorem 3.1 (Riemann mapping theorem) Suppose $\Omega$ is proper and
simply connected. If $z_0 \in \Omega$, then there exists a unique conformal map
$F : \Omega \to \mathbb{D}$ such that

$$F(z_0) = 0 \quad\text{and}\quad F'(z_0) > 0.$$

Corollary 3.2 Any two proper simply connected open subsets in C are 
conformally equivalent. 

Clearly, the corollary follows from the theorem, since we can use as
an intermediate step the unit disc. Also, the uniqueness statement in
the theorem is straightforward, since if $F$ and $G$ are conformal maps
from $\Omega$ to $\mathbb{D}$ that satisfy these two conditions, then $H = F \circ G^{-1}$ is an
automorphism of the disc that fixes the origin. Therefore $H(z) = e^{i\theta} z$,
and since $H'(0) > 0$, we must have $e^{i\theta} = 1$, from which we conclude that
$F = G$.

The rest of this section is devoted to the proof of the existence of the
conformal map $F$. The idea of the proof is as follows. We consider all
injective holomorphic maps $f : \Omega \to \mathbb{D}$ with $f(z_0) = 0$. From these we
wish to choose an $f$ so that its image fills out all of $\mathbb{D}$, and this can be
achieved by making $f'(z_0)$ as large as possible. In doing this, we shall
need to be able to extract $f$ as a limit from a given sequence of functions.
We turn to this point first.
3.2 Montel's theorem

Let $\Omega$ be an open subset of $\mathbb{C}$. A family $\mathcal{F}$ of holomorphic functions on
$\Omega$ is said to be normal if every sequence in $\mathcal{F}$ has a subsequence that
converges uniformly on every compact subset of $\Omega$ (the limit need not be
in $\mathcal{F}$).

The proof that a family of functions is normal is, in practice, the
consequence of two related properties, uniform boundedness and
equicontinuity. These we shall now define.

The family $\mathcal{F}$ is said to be uniformly bounded on compact subsets
of $\Omega$ if for each compact set $K \subset \Omega$ there exists $B > 0$, such that

$$|f(z)| \le B \quad\text{for all } z \in K \text{ and } f \in \mathcal{F}.$$

Also, the family $\mathcal{F}$ is equicontinuous on a compact set $K$ if for every
$\epsilon > 0$ there exists $\delta > 0$ such that whenever $z, w \in K$ and $|z - w| < \delta$,
then

$$|f(z) - f(w)| < \epsilon \quad\text{for all } f \in \mathcal{F}.$$

Equicontinuity is a strong condition, which requires uniform continuity,
uniformly in the family. For instance, any family of differentiable
functions on $[0, 1]$ whose derivatives are uniformly bounded is equicontinuous.
This follows directly from the mean value theorem. On the other hand,
note that the family $\{f_n\}$ on $[0, 1]$ given by $f_n(x) = x^n$ is not
equicontinuous since for any fixed $0 < x_0 < 1$ we have $|f_n(1) - f_n(x_0)| \to 1$ as $n$
tends to infinity.
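
The contrast between the two families can be made concrete. In the sketch below, `oscillation` measures $|f_n(1) - f_n(x_0)|$ for $f_n(x) = x^n$, while `bounded_derivative_family` is an assumed example, $\sin(nx)/n$, whose derivatives are bounded by 1 and which is therefore equicontinuous by the mean value theorem.

```python
import math


def oscillation(n, x0):
    """|f_n(1) - f_n(x0)| for the family f_n(x) = x**n on [0, 1]."""
    return abs(1.0 - x0 ** n)


def bounded_derivative_family(n, x):
    """g_n(x) = sin(n x)/n, an illustrative family with |g_n'(x)| <= 1."""
    return math.sin(n * x) / n


if __name__ == "__main__":
    x0 = 0.999  # very close to 1
    for n in (1, 10, 1000, 100000):
        print(n, oscillation(n, x0))  # grows towards 1 as n increases
```

However close $x_0$ is to 1, the oscillation eventually exceeds any $\epsilon < 1$, so no single $\delta$ works for every $n$.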

The theorem that follows puts together these new concepts and is an
important ingredient in the proof of the Riemann mapping theorem.

Theorem 3.3 Suppose $\mathcal{F}$ is a family of holomorphic functions on $\Omega$ that
is uniformly bounded on compact subsets of $\Omega$. Then:

(i) $\mathcal{F}$ is equicontinuous on every compact subset of $\Omega$.

(ii) $\mathcal{F}$ is a normal family.

The theorem really consists of two separate parts. The first part says
that $\mathcal{F}$ is equicontinuous under the assumption that $\mathcal{F}$ is a family of
holomorphic functions that is uniformly bounded on compact subsets of
$\Omega$. The proof follows from an application of the Cauchy integral formula
and hence relies on the fact that $\mathcal{F}$ consists of holomorphic functions.
This conclusion is in sharp contrast with the real situation as illustrated
by the family of functions given by $f_n(x) = \sin(nx)$ on $(0, 1)$, which is
uniformly bounded. However, this family is not equicontinuous and has
no convergent subsequence on any compact subinterval of $(0, 1)$.

The second part of the theorem is not complex-analytic in nature.
Indeed, the fact that $\mathcal{F}$ is a normal family follows from assuming only
that $\mathcal{F}$ is uniformly bounded and equicontinuous on compact subsets of
$\Omega$. This result is sometimes known as the Arzela-Ascoli theorem and its
proof consists primarily of a diagonalization argument.

We are required to prove convergence on arbitrary compact subsets of
$\Omega$, therefore it is useful to introduce the following notion. A sequence
$\{K_\ell\}_{\ell=1}^{\infty}$ of compact subsets of $\Omega$ is called an exhaustion if

(a) $K_\ell$ is contained in the interior of $K_{\ell+1}$ for all $\ell = 1, 2, \ldots$.

(b) Any compact set $K \subset \Omega$ is contained in $K_\ell$ for some $\ell$. In particular

\[ \Omega = \bigcup_{\ell=1}^{\infty} K_\ell. \]

Lemma 3.4 Any open set $\Omega$ in the complex plane has an exhaustion.


Proof. If $\Omega$ is bounded, we let $K_\ell$ denote the set of all points in $\Omega$
at distance $\ge 1/\ell$ from the boundary of $\Omega$. If $\Omega$ is not bounded, let $K_\ell$
denote the same set as above except that we also require $|z| \le \ell$ for all
$z \in K_\ell$.

We may now begin the proof of Montel's theorem. Let $K$ be a compact
subset of $\Omega$ and choose $r > 0$ so small that $D_{3r}(z)$ is contained in $\Omega$ for
all $z \in K$. It suffices to choose $r$ so that $3r$ is less than the distance
from $K$ to the boundary of $\Omega$. Let $z, w \in K$ with $|z - w| < r$, and let
$\gamma$ denote the boundary circle of the disc $D_{2r}(w)$. Then, by Cauchy's
integral formula, we have

\[ f(z) - f(w) = \frac{1}{2\pi i} \int_\gamma f(\zeta) \left[ \frac{1}{\zeta - z} - \frac{1}{\zeta - w} \right] d\zeta. \]

Observe that

\[ \left| \frac{1}{\zeta - z} - \frac{1}{\zeta - w} \right| = \frac{|z - w|}{|\zeta - z|\,|\zeta - w|} \le \frac{|z - w|}{r^2} \]

since $\zeta \in \gamma$ and $|z - w| < r$. Therefore, since $\gamma$ has length $4\pi r$,

\[ |f(z) - f(w)| \le \frac{1}{2\pi}\,\frac{4\pi r}{r^2}\, B\, |z - w|, \]

where $B$ denotes the uniform bound for the family $\mathcal{F}$ in the compact
set consisting of all points in $\Omega$ at a distance $\le 2r$ from $K$. Therefore
$|f(z) - f(w)| \le C|z - w|$, and this estimate is true for all $z, w \in K$ with
$|z - w| < r$ and $f \in \mathcal{F}$; thus this family is equicontinuous, as was to be
shown.

To prove the second part of the theorem, we argue as follows. Let
$\{f_n\}_{n=1}^{\infty}$ be a sequence in $\mathcal{F}$ and $K$ a compact subset of $\Omega$. Choose a
sequence of points $\{w_j\}_{j=1}^{\infty}$ that is dense in $\Omega$. Since $\{f_n\}$ is uniformly
bounded, there exists a subsequence $\{f_{n,1}\} = \{f_{1,1}, f_{2,1}, f_{3,1}, \ldots\}$ of $\{f_n\}$
such that $f_{n,1}(w_1)$ converges.

From $\{f_{n,1}\}$ we can extract a subsequence $\{f_{n,2}\} = \{f_{1,2}, f_{2,2}, f_{3,2}, \ldots\}$
so that $f_{n,2}(w_2)$ converges. We may continue this process, and extract a
subsequence $\{f_{n,j}\}$ of $\{f_{n,j-1}\}$ such that $f_{n,j}(w_j)$ converges.

Finally, let $g_n = f_{n,n}$ and consider the diagonal subsequence $\{g_n\}$. By
construction, $g_n(w_j)$ converges for each $j$, and we claim that
equicontinuity implies that $g_n$ converges uniformly on $K$. Given $\epsilon > 0$, choose $\delta$
as in the definition of equicontinuity, and note that for some $J$, the set
$K$ is contained in the union of the discs $D_\delta(w_1), \ldots, D_\delta(w_J)$. Pick $N$ so
large that if $n, m > N$, then

\[ |g_m(w_j) - g_n(w_j)| < \epsilon \quad \text{for all } j = 1, \ldots, J. \]

So if $z \in K$, then $z \in D_\delta(w_j)$ for some $1 \le j \le J$. Therefore,

\[ |g_n(z) - g_m(z)| \le |g_n(z) - g_n(w_j)| + |g_n(w_j) - g_m(w_j)| + |g_m(w_j) - g_m(z)| < 3\epsilon \]

whenever $n, m > N$. Hence $\{g_n\}$ converges uniformly on $K$.

Finally, we need one more diagonalization argument to obtain a
subsequence that converges uniformly on every compact subset of $\Omega$. Let
$K_1 \subset K_2 \subset \cdots \subset K_\ell \subset \cdots$ be an exhaustion of $\Omega$, and suppose $\{g_{n,1}\}$ is
a subsequence of the original sequence $\{f_n\}$ that converges uniformly on
$K_1$. Extract from $\{g_{n,1}\}$ a subsequence $\{g_{n,2}\}$ that converges uniformly
on $K_2$, and so on. Then, $\{g_{n,n}\}$ is a subsequence of $\{f_n\}$ that converges
uniformly on every $K_\ell$ and since the $K_\ell$ exhaust $\Omega$, the sequence $\{g_{n,n}\}$
converges uniformly on any compact subset of $\Omega$, as was to be shown.


We need one further result before we can give the proof of the Riemann 
mapping theorem. 


Proposition 3.5 If $\Omega$ is a connected open subset of $\mathbb{C}$ and $\{f_n\}$ a
sequence of injective holomorphic functions on $\Omega$ that converges uniformly
on every compact subset of $\Omega$ to a holomorphic function $f$, then $f$ is
either injective or constant.


Proof. We argue by contradiction and suppose that $f$ is not injective,
so there exist distinct complex numbers $z_1$ and $z_2$ in $\Omega$ such that $f(z_1) = f(z_2)$.
Define a new sequence by $g_n(z) = f_n(z) - f_n(z_1)$, so that $g_n$ has
no other zero besides $z_1$, and the sequence $\{g_n\}$ converges uniformly on
compact subsets of $\Omega$ to $g(z) = f(z) - f(z_1)$. If $g$ is not identically zero,
then $z_2$ is an isolated zero for $g$ (because $\Omega$ is connected); therefore

\[ 1 \le \frac{1}{2\pi i} \int_\gamma \frac{g'(\zeta)}{g(\zeta)}\, d\zeta, \]

where $\gamma$ is a small circle centered at $z_2$ chosen so that $g$ does not vanish
on $\gamma$ or at any point of its interior besides $z_2$. Therefore, $1/g_n$ converges
uniformly to $1/g$ on $\gamma$, and since $g_n' \to g'$ uniformly on $\gamma$ we have

\[ \frac{1}{2\pi i} \int_\gamma \frac{g_n'(\zeta)}{g_n(\zeta)}\, d\zeta \to \frac{1}{2\pi i} \int_\gamma \frac{g'(\zeta)}{g(\zeta)}\, d\zeta. \]

But this is a contradiction since $g_n$ has no zeros inside $\gamma$, and hence

\[ \frac{1}{2\pi i} \int_\gamma \frac{g_n'(\zeta)}{g_n(\zeta)}\, d\zeta = 0 \quad \text{for all } n. \]


3.3 Proof of the Riemann mapping theorem 

Once we have established the technical results above, the rest of the 
proof of the Riemann mapping theorem is very elegant. It consists of 
three steps, which we isolate. 

Step 1. Suppose that $\Omega$ is a simply connected proper open subset of
$\mathbb{C}$. We claim that $\Omega$ is conformally equivalent to an open subset of the
unit disc that contains the origin. Indeed, choose a complex number $\alpha$
that does not belong to $\Omega$ (recall that $\Omega$ is proper), and observe that
$z - \alpha$ never vanishes on the simply connected set $\Omega$. Therefore, we can
define a holomorphic function

\[ f(z) = \log(z - \alpha) \]

with the desired properties of the logarithm. As a consequence one has
$e^{f(z)} = z - \alpha$, which proves in particular that $f$ is injective. Pick a point
$w \in \Omega$, and observe that

\[ f(z) \ne f(w) + 2\pi i \quad \text{for all } z \in \Omega, \]

for otherwise, we exponentiate this relation to find that $z = w$, hence
$f(z) = f(w)$, a contradiction. In fact, we claim that $f(z)$ stays strictly
away from $f(w) + 2\pi i$, in the sense that there exists a disc centered at
$f(w) + 2\pi i$ that contains no points of the image $f(\Omega)$. Otherwise, there
exists a sequence $\{z_n\}$ in $\Omega$ such that $f(z_n) \to f(w) + 2\pi i$. We
exponentiate this relation, and, since the exponential function is continuous, we
must have $z_n \to w$. But this implies $f(z_n) \to f(w)$, which is a
contradiction. Finally, consider the map

\[ F(z) = \frac{1}{f(z) - (f(w) + 2\pi i)}. \]

Since $f$ is injective, so is $F$, hence $F : \Omega \to F(\Omega)$ is a conformal map.
Moreover, by our analysis, $F(\Omega)$ is bounded. We may therefore translate
and rescale the function $F$ in order to obtain a conformal map from $\Omega$
to an open subset of $\mathbb{D}$ that contains the origin.


Step 2. By the first step, we may assume that $\Omega$ is an open subset of $\mathbb{D}$
with $0 \in \Omega$. Consider the family $\mathcal{F}$ of all injective holomorphic functions
on $\Omega$ that map into the unit disc and fix the origin:

\[ \mathcal{F} = \{ f : \Omega \to \mathbb{D} \text{ holomorphic, injective and } f(0) = 0 \}. \]

First, note that $\mathcal{F}$ is non-empty since it contains the identity. Also,
this family is uniformly bounded by construction, since all functions are
required to map into the unit disc.

Now, we turn to the question of finding a function $f \in \mathcal{F}$ that
maximizes $|f'(0)|$. First, observe that the quantities $|f'(0)|$ are uniformly
bounded as $f$ ranges in $\mathcal{F}$. This follows from the Cauchy inequality
(Corollary 4.3 in Chapter 2) for $f'$ applied to a small disc centered at
the origin.

Next, we let

\[ s = \sup_{f \in \mathcal{F}} |f'(0)|, \]

and we choose a sequence $\{f_n\} \subset \mathcal{F}$ such that $|f_n'(0)| \to s$ as $n \to \infty$. By
Montel's theorem (Theorem 3.3), this sequence has a subsequence that
converges uniformly on compact sets to a holomorphic function $f$ on $\Omega$.
Since $s \ge 1$ (because $z \mapsto z$ belongs to $\mathcal{F}$), $f$ is non-constant, hence
injective, by Proposition 3.5. Also, by continuity we have
$|f(z)| \le 1$ for all $z \in \Omega$ and from the maximum modulus principle we
see that $|f(z)| < 1$. Since we clearly have $f(0) = 0$, we conclude that
$f \in \mathcal{F}$ with $|f'(0)| = s$.
Step 3. In this last step, we demonstrate that $f$ is a conformal map
from $\Omega$ to $\mathbb{D}$. Since $f$ is already injective, it suffices to prove that $f$ is
also surjective. If this were not true, we could construct a function in $\mathcal{F}$
with derivative at 0 greater than $s$. Indeed, suppose there exists $\alpha \in \mathbb{D}$
such that $f(z) \ne \alpha$ for all $z \in \Omega$, and consider the automorphism $\psi_\alpha$ of the disc that
interchanges 0 and $\alpha$, namely

\[ \psi_\alpha(z) = \frac{\alpha - z}{1 - \overline{\alpha} z}. \]

Since $\Omega$ is simply connected, so is $U = (\psi_\alpha \circ f)(\Omega)$, and moreover, $U$
does not contain the origin. It is therefore possible to define a square
root function on $U$ by

\[ g(w) = e^{\frac{1}{2} \log w}. \]

Next, consider the function

\[ F = \psi_{g(\alpha)} \circ g \circ \psi_\alpha \circ f. \]

We claim that $F \in \mathcal{F}$. Clearly $F$ is holomorphic and it maps 0 to 0.
Also $F$ maps into the unit disc since this is true of each of the functions
in the composition. Finally, $F$ is injective. This is clearly true for the
automorphisms $\psi_\alpha$ and $\psi_{g(\alpha)}$; it is also true for the square root $g$ and
the function $f$, since the latter is injective by assumption. If $h$ denotes
the square function $h(w) = w^2$, then we must have

\[ f = \psi_\alpha^{-1} \circ h \circ \psi_{g(\alpha)}^{-1} \circ F = \Phi \circ F. \]

But $\Phi$ maps $\mathbb{D}$ into $\mathbb{D}$ with $\Phi(0) = 0$, and is not injective because $F$ is
and $h$ is not. By the last part of the Schwarz lemma, we conclude that
$|\Phi'(0)| < 1$. The proof is complete once we observe that

\[ f'(0) = \Phi'(0) F'(0), \]

and thus

\[ |f'(0)| < |F'(0)|, \]

contradicting the maximality of $|f'(0)|$ in $\mathcal{F}$.

Finally, we multiply $f$ by a complex number of absolute value 1 so
that $f'(0) > 0$, which ends the proof.

For a variant of this proof, see Problem 7. 
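
The properties of the automorphism $\psi_\alpha$ used in Step 3 are easy to check numerically. The sketch below is only an illustration: the function name `psi` and the sample values of $\alpha$ and $z$ are assumptions made for the example. It confirms that $\psi_\alpha$ interchanges 0 and $\alpha$, that it is its own inverse, and that it sends a sample point of the disc back into the disc.

```python
def psi(alpha, z):
    """Disc automorphism psi_alpha(z) = (alpha - z) / (1 - conj(alpha) z)."""
    return (alpha - z) / (1 - alpha.conjugate() * z)


alpha = 0.3 + 0.4j   # sample point with |alpha| < 1 (illustrative)
z = -0.5 + 0.2j      # sample point with |z| < 1 (illustrative)

print(psi(alpha, 0j))              # approximately alpha
print(psi(alpha, alpha))           # approximately 0
print(psi(alpha, psi(alpha, z)))   # approximately z: psi is an involution
print(abs(psi(alpha, z)) < 1)      # the image stays inside the unit disc
```

Because $\psi_\alpha$ is an involution, the inverse $\psi_\alpha^{-1}$ that appears in the factorization of $f$ is $\psi_\alpha$ itself.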
Remark. It is worthwhile to point out that the only places where
the hypothesis of simple-connectivity entered in the proof were in the
uses of the logarithm and the square root. Thus it would have
sufficed to have assumed (in addition to the hypothesis that $\Omega$ is proper)
that $\Omega$ is holomorphically simply connected in the sense that for
any holomorphic function $f$ in $\Omega$ and any closed curve $\gamma$ in $\Omega$, we have
$\int_\gamma f(z)\, dz = 0$. Further discussion of this point, and various equivalent
properties of simple-connectivity, are given in Appendix B.

4 Conformal mappings onto polygons 

The Riemann mapping theorem guarantees the existence of a conformal
map from any proper, simply connected open set to the disc, or
equivalently to the upper half-plane, but this theorem gives little insight as
to the exact form of this map. In Section 1 we gave various explicit
formulas in the case of regions that have symmetries, but it is of course
unreasonable to ask for an explicit formula in the general case. There
is, however, another class of open sets for which there are nice formulas,
namely the polygons. Our aim in this last section is to give a proof of
the Schwarz-Christoffel formula, which describes the nature of conformal
maps from the disc (or upper half-plane) to polygons.

4.1 Some examples 

We begin by studying some motivating examples. The first two
correspond to easy (but infinite and degenerate) cases.

Example 1. First, we investigate the conformal map from the upper
half-plane to the sector $\{z : 0 < \arg z < \alpha\pi\}$, with $0 < \alpha < 2$, given in
Section 1 by $f(z) = z^\alpha$. Anticipating the Schwarz-Christoffel formula
below, we write

\[ z^\alpha = f(z) = \alpha \int_0^z \zeta^{-\beta}\, d\zeta \]

with $\alpha + \beta = 1$, and where the integral is taken along any path in the
upper half-plane. In fact, by continuity and Cauchy's theorem, we may
take the path of integration to lie in the closure of the upper half-plane.
Although the behavior of $f$ follows immediately from the original
definition, we study it in terms of the integral expression above, since this
provides insight for the general case treated later.

Note first that $\zeta^{-\beta}$ is integrable near 0 since $\beta < 1$, therefore $f(0) = 0$.
Observe that when $z$ is real and positive ($z = x$), then $f'(x) = \alpha x^{\alpha - 1}$ is
positive; also it is not finitely integrable at $\infty$. Therefore, as $x$ travels
from 0 to $\infty$, we see that $f(x)$ increases from 0 to $\infty$, thus $f$ maps $[0, \infty)$
to $[0, \infty)$. On the other hand, when $z = x$ is negative, then

\[ f'(z) = \alpha |x|^{\alpha - 1} e^{i\pi(\alpha - 1)} = -\alpha |x|^{\alpha - 1} e^{i\pi\alpha}, \]

so $f$ maps the segment $(-\infty, 0]$ to $(e^{i\pi\alpha}\infty, 0]$. The situation is illustrated
in Figure 3 where the infinite segment $A$ is mapped to $A'$ and the segment
$B$ is mapped to $B'$, with the direction of travel indicated in Figure 3.
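
The boundary behavior in Example 1 can be checked with a few lines of code. This is a sketch under stated assumptions: the value $\alpha = 0.75$ and the helper name `f` are illustrative, and Python's complex power is used as a stand-in for the branch of $z^\alpha$ that is positive on the positive real axis and continuous in the closed upper half-plane. That identification is valid here because Python assigns argument $\pi$ to negative real numbers with zero imaginary part.

```python
import cmath
import math

alpha = 0.75  # illustrative exponent with 0 < alpha < 2


def f(z):
    """The map z -> z**alpha, using the principal branch of the power."""
    return z ** alpha


pos = f(complex(3.0, 0.0))    # point on the positive real axis
neg = f(complex(-2.0, 0.0))   # point on the negative real axis

print(pos)                                 # essentially real and positive
print(cmath.phase(neg), alpha * math.pi)   # the two angles agree
```

The positive axis is sent to itself, while the negative axis is sent to the ray making angle $\alpha\pi$ with the positive real axis, in agreement with the computation of $f'$ above.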



Example 2. Next, we consider for $z \in \mathbb{H}$,

\[ f(z) = \int_0^z \frac{d\zeta}{(1 - \zeta^2)^{1/2}}, \]

where the integral is taken from 0 to $z$ along any path in the closed
upper half-plane. We choose the branch for $(1 - \zeta^2)^{1/2}$ that makes it
holomorphic in the upper half-plane and positive when $-1 < \zeta < 1$. As
a result

\[ (1 - \zeta^2)^{-1/2} = i(\zeta^2 - 1)^{-1/2} \quad \text{when } \zeta > 1. \]

We observe that $f$ maps the real line to the boundary of the half-strip
pictured in Figure 4.

Figure 4. Mapping of the boundary in Example 2

In fact, since $f(\pm 1) = \pm\pi/2$, and $f'(x) > 0$ if $-1 < x < 1$, we see that
$f$ maps the segment $B$ to $B'$. Moreover

\[ f(x) = \frac{\pi}{2} + \int_1^x f'(x)\, dx \quad \text{when } x > 1, \]

and

\[ \int_1^\infty \frac{dx}{(x^2 - 1)^{1/2}} = \infty. \]

Thus, as $x$ travels along the segment $C$, the image traverses the infinite
segment $C'$. Similarly segment $A$ is mapped to $A'$.

Note the connection of this example with Example 8 in Section 1.2. In
fact, one can show that the function $f(z)$ is the inverse to the function
$\sin z$, and hence $f$ takes $\mathbb{H}$ conformally to the interior of the half-strip
bounded by the segments $A'$, $B'$, and $C'$.
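
The claim that $f$ inverts $\sin z$ can be tested numerically. The following sketch makes several assumptions that are not in the text: the function name `sc_example2`, the sample point $z = 0.5 + i$, and the use of composite Simpson's rule along the straight segment from 0 to $z$. It also relies on the fact that the principal square root `cmath.sqrt` applied to $1 - \zeta^2$ is holomorphic in the open upper half-plane and positive on $(-1, 1)$, so it agrees with the branch chosen in the example.

```python
import cmath


def sc_example2(z, n=2000):
    """Approximate f(z), the integral from 0 to z of (1 - zeta^2)^(-1/2),
    along the straight segment zeta = t*z, 0 <= t <= 1, using composite
    Simpson's rule with n (even) subintervals."""
    def integrand(t):
        zeta = t * z
        # d(zeta) = z dt along the segment
        return z / cmath.sqrt(1 - zeta * zeta)

    h = 1.0 / n
    total = integrand(0.0) + integrand(1.0)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return total * h / 3


z = 0.5 + 1.0j  # sample point in the upper half-plane (illustrative)
w = sc_example2(z)
print(w)                      # a point of the half-strip |Re w| < pi/2, Im w > 0
print(abs(cmath.sin(w) - z))  # close to zero: sin(f(z)) = z
```

The computed value lands inside the half-strip of Figure 4, which is consistent with $f$ mapping $\mathbb{H}$ onto its interior.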


Example 3. Here we take

\[ f(z) = \int_0^z \frac{d\zeta}{[(1 - \zeta^2)(1 - k^2\zeta^2)]^{1/2}}, \]

where $k$ is a fixed real number with $0 < k < 1$ (the branch of
$[(1 - \zeta^2)(1 - k^2\zeta^2)]^{1/2}$ in the upper half-plane is chosen to be the one
that is positive when $\zeta$ is real and $-1 < \zeta < 1$). Integrals of this kind
are called elliptic integrals, because variants of these arise in the
calculation of the arc-length of an ellipse. We shall observe that $f$ maps the
real axis onto the rectangle shown in Figure 5(b), where $K$ and $K'$ are
determined by

\[ K = \int_0^1 \frac{dx}{[(1 - x^2)(1 - k^2x^2)]^{1/2}}, \qquad K' = \int_1^{1/k} \frac{dx}{[(x^2 - 1)(1 - k^2x^2)]^{1/2}}. \]

We divide the real axis into four "segments," with division points
$-1/k$, $-1$, $1$, and $1/k$ (see Figure 5(a)). The segments are $[-1/k, -1]$,
$[-1, 1]$, $[1, 1/k]$, and $[1/k, -1/k]$, the last consisting of the union of the
two half-segments $[1/k, \infty)$ and $(-\infty, -1/k]$. It is clear from the
definitions that $f(\pm 1) = \pm K$, and since $f'(x) > 0$ when $-1 < x < 1$, it follows
that $f$ maps the segment $[-1, 1]$ to $[-K, K]$. Moreover, since

\[ f(x) = K + \int_1^x \frac{d\zeta}{[(1 - \zeta^2)(1 - k^2\zeta^2)]^{1/2}} \quad \text{if } 1 < x < 1/k, \]

we see that $f$ maps the segment $[1, 1/k]$ to $[K, K + iK']$, where $K'$ was
defined above. Similarly, $f$ maps $[-1/k, -1]$ to $[-K + iK', -K]$. Next,
when $x > 1/k$ we have

\[ f'(x) = -\frac{1}{[(x^2 - 1)(k^2x^2 - 1)]^{1/2}}, \]

and therefore,

\[ f(x) = K + iK' - \int_{1/k}^x \frac{dx}{[(x^2 - 1)(k^2x^2 - 1)]^{1/2}}. \]

However,

\[ \int_{1/k}^\infty \frac{dx}{[(x^2 - 1)(k^2x^2 - 1)]^{1/2}} = \int_0^1 \frac{dx}{[(1 - x^2)(1 - k^2x^2)]^{1/2}}, \]

as can be seen by making the change of variables $x = 1/(ku)$ in the
integral on the left. Thus $f$ maps the segment $[1/k, \infty)$ to the segment
$[K + iK', iK')$. Similarly $f$ maps $(-\infty, -1/k]$ to $[-K + iK', iK')$.
Altogether, then, $f$ maps the real axis to the above rectangle, with the
point at infinity corresponding to the mid-point of the upper side of the
rectangle.

Figure 5. Mapping of the boundary in Example 3: (a) the real axis with division points $-1/k$, $-1$, $1$, $1/k$; (b) the rectangle with vertices $-K$, $K$, $K + iK'$, $-K + iK'$
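
The identity between the tail integral over $[1/k, \infty)$ and $K$ can be confirmed numerically. The sketch below rests on assumptions of our own: the helper names `simpson`, `K` and `tail`, the sample modulus $k = 0.5$, and two substitutions chosen to remove the endpoint singularities. For $K$ we put $x = \sin\theta$, which gives $\int_0^{\pi/2} d\theta / (1 - k^2\sin^2\theta)^{1/2}$. For the tail we put $x = \cosh(t)/k$, which gives $\int_0^\infty dt / (\cosh^2 t - k^2)^{1/2}$; this is deliberately different from the substitution $x = 1/(ku)$ used in the text, so the check is independent of it.

```python
import math


def simpson(g, a, b, n=4000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(a + j * h)
    return total * h / 3


def K(k):
    """K after the substitution x = sin(theta)."""
    return simpson(lambda th: 1.0 / math.sqrt(1 - (k * math.sin(th)) ** 2),
                   0.0, math.pi / 2)


def tail(k, T=40.0):
    """Integral over [1/k, infinity) after x = cosh(t)/k, truncated at t = T.
    The integrand decays like 2*exp(-t), so the truncation error is negligible."""
    return simpson(lambda t: 1.0 / math.sqrt(math.cosh(t) ** 2 - k * k),
                   0.0, T)


k = 0.5  # illustrative modulus with 0 < k < 1
print(K(k), tail(k))  # the two values agree to high accuracy
```

This agreement is what forces the image of $[1/k, \infty)$ to end at $iK'$, the mid-point of the upper side of the rectangle.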


The results obtained so far lead naturally to two problems. 

The first, which we pursue next, consists of a generalization of the
above examples. More precisely we define the Schwarz-Christoffel
integral and prove that it maps the real line to a polygonal line.

Second, we note that in the examples above little was inferred about
the behavior of $f$ in $\mathbb{H}$ itself. In particular, we have not shown that $f$
maps $\mathbb{H}$ conformally to the interior of the corresponding polygon.
After a careful study of the boundary behavior of conformal maps, we
prove a theorem that guarantees that the conformal map from the upper
half-plane to a simply connected region bounded by a polygonal line is
essentially given by a Schwarz-Christoffel integral.


4.2 The Schwarz-Christoffel integral 

With the examples of the previous section in mind, we define the general
Schwarz-Christoffel integral by

\[ S(z) = \int_0^z \frac{d\zeta}{(\zeta - A_1)^{\beta_1} \cdots (\zeta - A_n)^{\beta_n}}. \tag{5} \]

Here $A_1 < A_2 < \cdots < A_n$ are $n$ distinct points on the real axis arranged
in increasing order. The exponents $\beta_k$ will be assumed to satisfy the
conditions $\beta_k < 1$ for each $k$ and $1 < \sum_{k=1}^n \beta_k$.⁶

The integrand in (5) is defined as follows: $(z - A_k)^{\beta_k}$ is that branch
(defined in the complex plane slit along the infinite ray $\{A_k + iy : y \le 0\}$)
which is positive when $z = x$ is real and $x > A_k$. As a result

\[ (z - A_k)^{\beta_k} = \begin{cases} (x - A_k)^{\beta_k} & \text{if } x \text{ is real and } x > A_k, \\ |x - A_k|^{\beta_k} e^{i\pi\beta_k} & \text{if } x \text{ is real and } x < A_k. \end{cases} \]

The complex plane slit along the union of the rays $\bigcup_{k=1}^n \{A_k + iy : y \le 0\}$
is simply connected (see Exercise 19), so the integral that defines $S(z)$ is
holomorphic in this open set. Since the requirement $\beta_k < 1$ implies that
the singularities $(\zeta - A_k)^{-\beta_k}$ are integrable near $A_k$, the function $S$ is
continuous up to the real line, including the points $A_k$, with $k = 1, \ldots, n$.
Finally, this continuity condition implies that the integral can be taken
along any path in the complex plane that avoids the union of the open
slits $\bigcup_{k=1}^n \{A_k + iy : y < 0\}$.

Now

\[ \left| \prod_{k=1}^n (\zeta - A_k)^{-\beta_k} \right| \le c\, |\zeta|^{-\sum \beta_k} \]

for $|\zeta|$ large, so the assumption $\sum \beta_k > 1$ guarantees the convergence of
the integral (5) at infinity. This fact and Cauchy's theorem imply that
$\lim_{r \to \infty} S(re^{i\theta})$ exists and is independent of the angle $\theta$, $0 \le \theta \le \pi$. We
call this limit $a_\infty$, and we let $a_k = S(A_k)$ for $k = 1, \ldots, n$.


6 Note that the case $\sum \beta_k \le 1$, which occurs in Examples 1 and 2 above is excluded.
However, a modification of the proposition that follows can be made to take these cases
into account; but then $S(z)$ is no longer bounded in the upper half-plane.
Proposition 4.1 Suppose $S(z)$ is given by (5).

(i) If $\sum_{k=1}^n \beta_k = 2$, and $p$ denotes the polygon whose vertices are given
(in order) by $a_1, \ldots, a_n$, then $S$ maps the real axis onto $p - \{a_\infty\}$.
The point $a_\infty$ lies on the segment $[a_n, a_1]$ and is the image of the
point at infinity. Moreover, the (interior) angle at the vertex $a_k$ is
$\alpha_k\pi$ where $\alpha_k = 1 - \beta_k$.

(ii) There is a similar conclusion when $1 < \sum_{k=1}^n \beta_k < 2$, except now
the image of the extended line is the polygon of $n + 1$ sides with
vertices $a_1, a_2, \ldots, a_n, a_\infty$. The angle at the vertex $a_\infty$ is $\alpha_\infty\pi$
with $\alpha_\infty = 1 - \beta_\infty$, where $\beta_\infty = 2 - \sum_{k=1}^n \beta_k$.

Figure 6 illustrates the proposition. The idea of the proof is already
captured in Example 1 above.

Figure 6. Action of the integral $S(z)$

Proof. We assume that $\sum_{k=1}^n \beta_k = 2$. If $A_k < x < A_{k+1}$ when
$1 \le k \le n - 1$, then

\[ S'(x) = \prod_{j \le k} (x - A_j)^{-\beta_j} \prod_{j > k} (x - A_j)^{-\beta_j}. \]

Hence

\[ \arg S'(x) = \arg \prod_{j > k} (x - A_j)^{-\beta_j} = \arg \prod_{j > k} e^{-i\pi\beta_j} = -\pi \sum_{j > k} \beta_j, \]

which of course is constant when $x$ traverses the interval $(A_k, A_{k+1})$.
Since

\[ S(x) = S(A_k) + \int_{A_k}^x S'(y)\, dy, \]

we see that as $x$ varies from $A_k$ to $A_{k+1}$, $S(x)$ varies from $S(A_k) = a_k$
to $S(A_{k+1}) = a_{k+1}$ along the straight line segment⁷ $[a_k, a_{k+1}]$, and
this makes an angle of $-\pi \sum_{j > k} \beta_j$ with the real axis. Similarly, when
$A_n < x$ then $S'(x)$ is positive, while if $x < A_1$ the argument of $S'(x)$
is $-\pi \sum_{k=1}^n \beta_k = -2\pi$, and so $S'(x)$ is again positive. Thus as $x$ varies
on $[A_n, +\infty)$, $S(x)$ varies along a straight line (parallel to the $x$-axis)
between $a_n$ and $a_\infty$; similarly $S(x)$ varies along a straight line (parallel
to that axis) between $a_\infty$ and $a_1$ as $x$ varies in $(-\infty, A_1]$. Moreover, the
union of $[a_n, a_\infty)$ and $(a_\infty, a_1]$ is the segment $[a_n, a_1]$ with the point $a_\infty$
removed.

Now the increase of the angle of $[a_{k+1}, a_k]$ over that of $[a_{k-1}, a_k]$ is
$\pi\beta_k$, which means that the angle at the vertex $a_k$ is $\pi\alpha_k$. The proof
when $1 < \sum_{k=1}^n \beta_k < 2$ is similar, and is left to the reader.

As elegant as this proposition is, it does not settle the problem of 
finding a conformal map from the half-plane to a given region P that is 
bounded by a polygon. There are two reasons for this. 

1. It is not true for general $n$ and generic choices of $A_1, \ldots, A_n$, that
the polygon (which is the image of the real axis under $S$) is simple,
that is, it does not cross itself. Nor is it true in general that the
mapping $S$ is conformal on the upper half-plane.

2. Neither does the proposition show that starting with a simply
connected region $P$ (whose boundary is a polygonal line $p$) the mapping
$S$ is, for certain choices of $A_1, \ldots, A_n$ and simple modifications, a
conformal map from $\mathbb{H}$ to $P$. That however is the case, and is the
result whose proof we now turn to.


7 We denote the closed straight line segment between two complex numbers $z$ and $w$
by $[z, w]$, that is, $[z, w] = \{(1 - t)z + tw : t \in [0, 1]\}$. If we restrict $0 < t < 1$, then $(z, w)$
denotes the open line segment between $z$ and $w$. Similarly for the half-open segments
$[z, w)$ and $(z, w]$ obtained by restricting $0 \le t < 1$ and $0 < t \le 1$, respectively.
4.3 Boundary behavior

In what follows we shall consider a polygonal region P, namely a 
bounded, simply connected open set whose boundary is a polygonal line 
p. In this context, we always assume that the polygonal line is closed, 
and we sometimes refer to p as a polygon. 

To study conformal maps from the half-plane H to P, we consider first 
the conformal maps from the disc D to P, and their boundary behavior. 

Theorem 4.2 If $F : \mathbb{D} \to P$ is a conformal map, then $F$ extends to a
continuous bijection from the closure $\overline{\mathbb{D}}$ of the disc to the closure $\overline{P}$ of
the polygonal region. In particular, $F$ gives rise to a bijection from the
boundary of the disc to the boundary polygon $p$.

The main point consists in showing that if $z_0$ belongs to the unit circle,
then $\lim_{z \to z_0} F(z)$ exists. To prove this, we need a preliminary result,
which uses the fact that if $f : U \to f(U)$ is conformal, then

\[ \mathrm{Area}(f(U)) = \iint_U |f'(z)|^2\, dx\, dy. \]

This assertion follows from the definition, $\mathrm{Area}(f(U)) = \iint_{f(U)} dx\, dy$,
and the fact that the determinant of the Jacobian in the change of
variables $w = f(z)$ is simply $|f'(z)|^2$, an observation we made in equation (4),
Section 2.2, Chapter 1.

Lemma 4.3 For each $0 < r < 1/2$, let $C_r$ denote the circle centered at
$z_0$ of radius $r$. Suppose that for all sufficiently small $r$ we are given
two points $z_r$ and $z_r'$ in the unit disc that also lie on $C_r$. If we let
$\rho(r) = |f(z_r) - f(z_r')|$, then there exists a sequence $\{r_n\}$ of radii that
tends to zero, and such that $\lim_{n \to \infty} \rho(r_n) = 0$.

Proof. If not, there exist $0 < c$ and $0 < R < 1/2$ such that $c \le \rho(r)$
for all $0 < r \le R$. Observe that

\[ f(z_r) - f(z_r') = \int_\alpha f'(\zeta)\, d\zeta, \]

where the integral is taken over the arc $\alpha$ on $C_r$ that joins $z_r$ and $z_r'$ in
$\mathbb{D}$. If we parametrize this arc by $z_0 + re^{i\theta}$ with $\theta_1(r) \le \theta \le \theta_2(r)$, then

\[ \rho(r) \le \int_{\theta_1(r)}^{\theta_2(r)} |f'(z)|\, r\, d\theta. \]

We now apply the Cauchy-Schwarz inequality to see that

\[ \rho(r) \le \left( \int_{\theta_1(r)}^{\theta_2(r)} |f'(z)|^2\, r\, d\theta \right)^{1/2} \left( \int_{\theta_1(r)}^{\theta_2(r)} r\, d\theta \right)^{1/2}. \]

Squaring both sides and dividing by $r$ yields

\[ \frac{\rho(r)^2}{r} \le 2\pi \int_{\theta_1(r)}^{\theta_2(r)} |f'(z)|^2\, r\, d\theta. \]

We may now integrate both sides from 0 to $R$, and since $c \le \rho(r)$ on that
region we obtain

\[ c^2 \int_0^R \frac{dr}{r} \le 2\pi \int_0^R \int_{\theta_1(r)}^{\theta_2(r)} |f'(z)|^2\, r\, d\theta\, dr \le 2\pi \iint_{\mathbb{D}} |f'(z)|^2\, dx\, dy. \]

Now the left-hand side is infinite because $1/r$ is not integrable near the
origin, and the right-hand side is bounded because the area of the
polygonal region is bounded, so this yields the desired contradiction and
concludes the proof of the lemma.


Lemma 4.4 Let $z_0$ be a point on the unit circle. Then $F(z)$ tends to a
limit as $z$ approaches $z_0$ within the unit disc.

Proof. If not, there are two sequences $\{z_1, z_2, \ldots\}$ and $\{z_1', z_2', \ldots\}$
in the unit disc that converge to $z_0$ and are so that $F(z_n)$ and $F(z_n')$
converge to two distinct points $\zeta$ and $\zeta'$ in the closure of $P$. Since $F$
is conformal, the points $\zeta$ and $\zeta'$ must lie on the boundary $p$ of $P$. We
may therefore choose two disjoint discs $D$ and $D'$ centered at $\zeta$ and $\zeta'$,
respectively, that are at a distance $d > 0$ from each other. For all large
$n$, $F(z_n) \in D$ and $F(z_n') \in D'$. Therefore, there exist two continuous
curves⁸ $\Lambda$ and $\Lambda'$ in $D \cap P$ and $D' \cap P$, respectively, with $F(z_n) \in \Lambda$ and
$F(z_n') \in \Lambda'$ for all large $n$, and with the end-points of $\Lambda$ and $\Lambda'$ equal to
$\zeta$ and $\zeta'$, respectively.

Define $\lambda = F^{-1}(\Lambda)$ and $\lambda' = F^{-1}(\Lambda')$. Then $\lambda$ and $\lambda'$ are two
continuous curves in $\mathbb{D}$. Moreover, both $\lambda$ and $\lambda'$ contain infinitely many points
in each sequence $\{z_n\}$ and $\{z_n'\}$. Recall that these sequences converge to
$z_0$. By continuity, the circle $C_r$ centered at $z_0$ and of radius $r$ will
intersect $\lambda$ and $\lambda'$ for all small $r$, say at some points $z_r \in \lambda$ and $z_r' \in \lambda'$. This
contradicts the previous lemma, because $|F(z_r) - F(z_r')| > d$. Therefore
$F(z)$ converges to a limit on $p$ as $z$ approaches $z_0$ from within the unit
disc, and the proof is complete.

Figure 7. Illustration for the proof of Lemma 4.4


8 By a continuous curve, we mean the image of a continuous (not necessarily piecewise-
smooth) function from a closed interval $[a, b]$ to $\mathbb{C}$.

Lemma 4.5 The conformal map $F$ extends to a continuous function
from the closure of the disc to the closure of the polygon.

Proof. By the previous lemma, the limit

\[ \lim_{z \to z_0} F(z) \]

exists, and we define $F(z_0)$ to be the value of this limit. There
remains to prove that $F$ is continuous on the closure of the unit disc.
Given $\epsilon$, there exists $\delta$ such that whenever $z \in \mathbb{D}$ and $|z - z_0| < \delta$, then
$|F(z) - F(z_0)| < \epsilon$. Now if $z$ belongs to the boundary of $\mathbb{D}$ and
$|z - z_0| < \delta$, then we may choose $w \in \mathbb{D}$ such that $|F(z) - F(w)| < \epsilon$ and
$|w - z_0| < \delta$. Therefore

\[ |F(z) - F(z_0)| \le |F(z) - F(w)| + |F(w) - F(z_0)| < 2\epsilon, \]

and the lemma is established.

We may now complete the proof of the theorem. We have shown that
$F$ extends to a continuous function from $\overline{\mathbb{D}}$ to $\overline{P}$. The previous argument
can be applied to the inverse $G$ of $F$. Indeed, the key geometric property
of the unit disc that we used was that if $z_0$ belongs to the boundary of
$\mathbb{D}$, and $C$ is any small circle centered at $z_0$, then $C \cap \mathbb{D}$ consists of an
arc. Clearly, this property also holds at every boundary point of the
polygonal region $P$. Therefore, $G$ also extends to a continuous function
from $\overline{P}$ to $\overline{\mathbb{D}}$. It suffices to now prove that the extensions of $F$ and $G$
are inverses of each other. If $z \in \partial\mathbb{D}$ and $\{z_k\}$ is a sequence in the disc
that converges to $z$, then $G(F(z_k)) = z_k$, so after taking the limit and
using the fact that $F$ is continuous, we conclude that $G(F(z)) = z$ for all
$z \in \overline{\mathbb{D}}$. Similarly, $F(G(w)) = w$ for all $w \in \overline{P}$, and the theorem is proved.

The circle of ideas used in this proof can be used to prove more general
theorems on the boundary continuity of conformal maps. See Exercise 18
and Problem 6 below.

4.4 The mapping formula 

Suppose $P$ is a polygonal region bounded by a polygon $p$ whose vertices
are ordered consecutively $a_1, a_2, \ldots, a_n$, and with $n \ge 3$. We denote by
$\pi\alpha_k$ the interior angle at $a_k$, and define the exterior angle $\pi\beta_k$ by
$\alpha_k + \beta_k = 1$. A simple geometric argument provides $\sum_{k=1}^n \beta_k = 2$.

We shall consider conformal mappings of the half-plane $\mathbb{H}$ to $P$, and
make use of the results of the previous section regarding conformal maps
from the disc $\mathbb{D}$ to $P$. The standard correspondences $w = (i - z)/(i + z)$,
$z = i(1 - w)/(1 + w)$ allow us to go back and forth between $z \in \mathbb{H}$ and
$w \in \mathbb{D}$. Notice that the boundary point $w = -1$ of the circle corresponds
to the point at infinity on the line, and so the conformal map of $\mathbb{H}$ to
$\mathbb{D}$ extends to a continuous bijection of the boundary of $\mathbb{H}$, which for the
purpose of this discussion includes the point at infinity.
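
The two standard correspondences are inverse to one another, which a short computation confirms. The sketch below is illustrative only: the function names `to_disc` and `to_half_plane` and the sample points are our own. It checks that a point of $\mathbb{H}$ lands in $\mathbb{D}$, that real points land on the unit circle, and that large real points approach $w = -1$.

```python
def to_disc(z):
    """Map the upper half-plane to the unit disc: w = (i - z) / (i + z)."""
    return (1j - z) / (1j + z)


def to_half_plane(w):
    """Inverse map from the disc to the half-plane: z = i (1 - w) / (1 + w)."""
    return 1j * (1 - w) / (1 + w)


z = 2.0 + 3.0j                    # sample point with Im z > 0 (illustrative)
print(abs(to_disc(z)))            # strictly less than 1
print(to_half_plane(to_disc(z)))  # recovers z up to rounding
print(abs(to_disc(5.0 + 0j)))     # real points land on the unit circle
print(to_disc(1e9 + 0j))          # close to -1, the image of infinity
```

The last line illustrates why $w = -1$ plays the role of the point at infinity on the line.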

Let $F$ be a conformal map from $\mathbb{H}$ to $P$. (Its existence is guaranteed by
the Riemann mapping theorem and the previous discussion.) We assume
first that none of the vertices of $p$ correspond to the point at infinity.
Therefore, there are real numbers $A_1, A_2, \ldots, A_n$ so that $F(A_k) = a_k$ for
all $k$. Since $F$ is continuous and injective, and the vertices are numbered
consecutively, we may conclude that the $A_k$'s are in either increasing
or decreasing order. After relabeling the vertices $a_k$ and the points $A_k$,
we may assume that $A_1 < A_2 < \cdots < A_n$. These points divide the real
line into $n - 1$ segments $[A_k, A_{k+1}]$, $1 \le k \le n - 1$, and the segment that
consists of the join of the two half-segments $(-\infty, A_1] \cup [A_n, \infty)$. These
are mapped bijectively onto the corresponding sides of the polygon, that
is, the segments $[a_k, a_{k+1}]$, $1 \le k \le n - 1$, and $[a_n, a_1]$ (see Figure 8).

Theorem 4.6 There exist complex numbers $c_1$ and $c_2$ so that the
conformal map $F$ of $\mathbb{H}$ to $P$ is given by

\[ F(z) = c_1 S(z) + c_2 \]

where $S$ is the Schwarz-Christoffel integral introduced in Section 4.2.

Figure 8. The mapping $F$


Proof. We first consider $z$ in the upper half-plane lying above the
two adjacent segments $[A_{k-1}, A_k]$ and $[A_k, A_{k+1}]$, where $1 < k < n$. We
note that $F$ maps these two segments to two segments that intersect at
$a_k = F(A_k)$ at an angle $\pi\alpha_k$.

By choosing a branch of the logarithm we can in turn define

\[ h_k(z) = (F(z) - a_k)^{1/\alpha_k} \]

for all $z$ in the half-strip in the upper half-plane bounded by the lines
$\mathrm{Re}(z) = A_{k-1}$ and $\mathrm{Re}(z) = A_{k+1}$. Since $F$ continues to the boundary of
$\mathbb{H}$, the map $h_k$ is actually continuous up to the segment $(A_{k-1}, A_{k+1})$
on the real line. By construction $h_k$ will map the segment $[A_{k-1}, A_{k+1}]$
to a (straight) segment $L_k$ in the complex plane, with $A_k$ mapped to
0. We may therefore apply the Schwarz reflection principle to see that
$h_k$ is analytically continuable to a holomorphic function in the two-way
infinite strip $A_{k-1} < \mathrm{Re}(z) < A_{k+1}$ (see Figure 9). We claim that $h_k'$
never vanishes in that strip. First, if $z$ belongs to the open upper
half-strip, then

\[ \frac{F'(z)}{F(z) - F(A_k)} = \alpha_k \frac{h_k'(z)}{h_k(z)}, \]

and since $F$ is conformal, we have $F'(z) \ne 0$ so $h_k'(z) \ne 0$
(Proposition 1.1). By reflection, this also holds in the lower half-strip, and it
remains to investigate points on the segment $(A_{k-1}, A_{k+1})$. If
$A_{k-1} < x < A_{k+1}$, we note that the image under $h_k$ of a small half-disc centered
at $x$ and contained in $\mathbb{H}$ lies on one side of the straight line segment
$L_k$. Since $h_k$ is injective up to $L_k$ (because $F$ is) the symmetry in the
Schwarz reflection principle guarantees that $h_k$ is injective in the whole
disc centered at $x$, whence $h_k'(x) \ne 0$, whence $h_k'(z) \ne 0$ for all $z$ in the
strip $A_{k-1} < \mathrm{Re}(z) < A_{k+1}$.

Figure 9. Schwarz reflection

Now because F r = akh^ k h r k and F" = -Pk^kh^ k ~ X (h^) 2 -\- 
akh^^ k h f ^ the fact that h f k (z) ^ 0 implies that 


F ,r {z) 


~Pk 


+ Ek(z) ， 


where Ek is holomorphic in the strip Ak-i < Re(z) < A&+i. A similar 
result holds for k = 1 and k = n, namely 


F ;, (z) 


A 


+ 五 "i ， 


F f (z) — z- A 1 
where E\ is holomorphic in the strip —oo < Re( 2 ：) < A 2 , and 


F"(z) (3 n 

F'{z) ~ z-A n + n 


where $E_n$ is holomorphic in the strip $A_{n-1} < \mathrm{Re}(z) < \infty$. Finally, 
another application of the reflection principle shows that $F$ is continuable 
in the exterior of a disc $|z| \le R$, for large $R$ (say $R > \max_{1 \le k \le n} |A_k|$). 
Indeed, we may continue $F$ across the union of the segments 
$(-\infty, A_1) \cup (A_n, \infty)$ since their image under $F$ is a straight line 
segment and Schwarz reflection applies. The fact that $F$ maps the upper 
half-plane to a bounded region shows that the analytic continuation of $F$ 
outside a large disc is also bounded, and hence holomorphic at infinity. 
Thus $F''/F'$ is holomorphic at infinity and we claim that it goes to 0 as 
$|z| \to \infty$. Indeed, we may expand $F$ at $z = \infty$ as 

$$F(z) = c_0 + \frac{c_1}{z} + \frac{c_2}{z^2} + \cdots.$$

This after differentiation shows that $F''/F'$ decays like $1/z$ as $|z|$ becomes 
large, and proves our claim. 

Altogether then, because the various strips overlap and cover the entire 
complex plane, 

$$\frac{F''(z)}{F'(z)} + \sum_{k=1}^{n} \frac{\beta_k}{z - A_k}$$

is holomorphic in the entire plane and vanishes at infinity; thus, by 
Liouville's theorem it is zero. Hence 

$$\frac{F''(z)}{F'(z)} = -\sum_{k=1}^{n} \frac{\beta_k}{z - A_k}.$$

From this we contend that $F'(z) = c(z - A_1)^{-\beta_1} \cdots (z - A_n)^{-\beta_n}$. 
Indeed, denoting this product by $Q(z)$, we have 

$$\frac{Q'(z)}{Q(z)} = -\sum_{k=1}^{n} \frac{\beta_k}{z - A_k}.$$

Therefore 

$$\frac{d}{dz}\left(\frac{F'(z)}{Q(z)}\right) = 0,$$

which proves the contention. A final integration yields the theorem. 

We may now withdraw the hypothesis we made at the beginning that 
F did not map the point at infinity to a vertex of P, and obtain a formula 
for that case as well. 


Theorem 4.7 If $F$ is a conformal map from the upper half-plane to the 
polygonal region $P$ and maps the points $A_1, \ldots, A_{n-1}, \infty$ to the vertices 
of $\mathfrak{p}$, then there exist constants $C_1$ and $C_2$ such that 

$$F(z) = C_1 \int_0^z \frac{d\zeta}{(\zeta - A_1)^{\beta_1} \cdots (\zeta - A_{n-1})^{\beta_{n-1}}} + C_2.$$

In other words, the formula is obtained by deleting the last term in 
the Schwarz-Christoffel integral (5). 

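Proof. After a preliminary translation, we may assume that $A_j \neq 0$ 
for $j = 1, \ldots, n-1$. Choose a point $A_n^* > 0$ on the real line, and consider 
the fractional linear map defined by 

$$\Phi(z) = A_n^* - \frac{1}{z}.$$

Then $\Phi$ is an automorphism of the upper half-plane. Let $A_k^* = \Phi(A_k)$ 
for $k = 1, \ldots, n-1$, and note that $A_n^* = \Phi(\infty)$. Then 

$$(F \circ \Phi^{-1})(A_k^*) = a_k \quad \text{for all } k = 1, 2, \ldots, n.$$

We can now apply the Schwarz-Christoffel formula just proved to find 
that 

$$(F \circ \Phi^{-1})(z') = C_1 \int_0^{z'} \frac{d\zeta}{(\zeta - A_1^*)^{\beta_1} \cdots (\zeta - A_n^*)^{\beta_n}} + C_2.$$

The change of variables $\zeta = \Phi(w)$ satisfies $d\zeta = dw/w^2$, and since we can 
write $2 = \beta_1 + \cdots + \beta_n$, the powers of $w$ cancel and we obtain, with new 
constants $C_1'$, $C_1''$, and $C_2'$, 

$$(F \circ \Phi^{-1})(z') = C_1' \int_0^{\Phi^{-1}(z')} \frac{dw}{(w(A_n^* - A_1^*) - 1)^{\beta_1} \cdots (w(A_n^* - A_{n-1}^*) - 1)^{\beta_{n-1}}} + C_2'$$

$$= C_1'' \int_0^{\Phi^{-1}(z')} \frac{dw}{\left(w - \frac{1}{A_n^* - A_1^*}\right)^{\beta_1} \cdots \left(w - \frac{1}{A_n^* - A_{n-1}^*}\right)^{\beta_{n-1}}} + C_2'.$$

Finally, we note that $1/(A_n^* - A_k^*) = A_k$ and set $\Phi^{-1}(z') = z$ in the above 
equation to conclude that 

$$F(z) = C_1'' \int_0^z \frac{dw}{(w - A_1)^{\beta_1} \cdots (w - A_{n-1})^{\beta_{n-1}}} + C_2',$$

as was to be shown. 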


4.5 Return to elliptic integrals 

We consider again the elliptic integral 

$$I(z) = \int_0^z \frac{d\zeta}{[(1 - \zeta^2)(1 - k^2\zeta^2)]^{1/2}} \quad \text{with } 0 < k < 1,$$

which arose in Example 3 of Section 4.1. We saw that it mapped the real 
axis to the rectangle $R$ with vertices $-K$, $K$, $K + iK'$, and $-K + iK'$. 
We will now see that this mapping is a conformal mapping of $\mathbb{H}$ to the 
interior of $R$. 

According to Theorem 4.6 there is a conformal map $F$ to the rectangle, 
that maps four points on the real axis to the vertices of $R$. By preceding 
this map with a suitable automorphism of $\mathbb{H}$ we may assume that $F$ 
maps $-1$, $0$, $1$ to $-K$, $0$, $K$, respectively. Indeed, by using a preliminary 
automorphism, we may assume that $-K$, $0$, $K$ are the images of points 
$A_1$, $0$, $A_2$ with $A_1 < 0 < A_2$; then we can further take $A_1 = -1$ and 
$A_2 = 1$. See Exercise 15. 

Next, let $\ell$ be chosen with $0 < \ell < 1$, so that $1/\ell$ is the point on the 
real line mapped by $F$ to the vertex $K + iK'$, which is the vertex next in 
order after $-K$ and $K$. We claim that $F(-1/\ell)$ is the vertex $-K + iK'$. 
Indeed, if $F^*(z) = -\overline{F(-\bar{z})}$, then by the symmetry of $R$, $F^*$ is also a 
conformal map of $\mathbb{H}$ to $R$; moreover $F^*(0) = 0$, and $F^*(\pm 1) = \pm K$. Thus 
$F^{-1} \circ F^*$ is an automorphism of $\mathbb{H}$ that fixes the points $-1$, $0$, and $1$. 
Hence $F^{-1} \circ F^*$ is the identity (see Exercise 15), and $F = F^*$, from which 
it follows that 

$$F(-1/\ell) = -\overline{F(1/\ell)} = -K + iK'.$$

Therefore, by Theorem 4.6 

$$F(z) = c_1 \int_0^z \frac{d\zeta}{[(1 - \zeta^2)(1 - \ell^2\zeta^2)]^{1/2}} + c_2.$$

Setting $z = 0$ gives $c_2 = 0$, and letting $z = 1$, $z = 1/\ell$, yields 

$$K(k) = c_1 K(\ell) \quad \text{and} \quad K'(k) = c_1 K'(\ell),$$

where 

$$K(k) = \int_0^1 \frac{dx}{[(1 - x^2)(1 - k^2x^2)]^{1/2}}, \qquad K'(k) = \int_1^{1/k} \frac{dx}{[(x^2 - 1)(1 - k^2x^2)]^{1/2}}.$$

Now $K(k)$ is clearly strictly increasing as $k$ varies in $(0, 1)$. Moreover, a 
change of variables (Exercise 24) establishes the identity 

$$K'(k) = K(\tilde{k}) \quad \text{where } \tilde{k}^2 = 1 - k^2 \text{ and } \tilde{k} > 0,$$

and this shows that $K'(k)$ is strictly decreasing. Hence $K(k)/K'(k)$ 
is strictly increasing. Since $K(k)/K'(k) = K(\ell)/K'(\ell)$, we must have 
$k = \ell$, and finally $c_1 = 1$. This shows that $I(z) = F(z)$, and hence $I$ is 
conformal, as was to be proved. 
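
The two facts used in this argument, the identity between $K'(k)$ and $K$ at the complementary modulus and the monotonicity of $K/K'$, can be checked numerically. The following Python sketch is only an illustration and not part of the argument: the function names `K` and `K_prime`, the midpoint rule, and the regularizing substitution $x = 1 + (1/k - 1)\sin^2 t$ (which turns the integrand of $K'$ into $2/\sqrt{k(x+1)(1+kx)}$) are our own choices.

```python
import math

def K(k, n=2000):
    """K(k) after the substitution x = sin(theta):
    integral over [0, pi/2] of 1 / sqrt(1 - k^2 sin^2(theta)), midpoint rule."""
    h = (math.pi / 2) / n
    return h * sum(
        1.0 / math.sqrt(1.0 - (k * math.sin((j + 0.5) * h)) ** 2)
        for j in range(n)
    )

def K_prime(k, n=2000):
    """K'(k) computed from its definition on [1, 1/k].

    The substitution x = 1 + (1/k - 1) sin^2(t) removes both endpoint
    singularities and leaves the smooth integrand 2 / sqrt(k (x+1)(1+kx)).
    """
    h = (math.pi / 2) / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h
        x = 1.0 + (1.0 / k - 1.0) * math.sin(t) ** 2
        total += 2.0 / math.sqrt(k * (x + 1.0) * (1.0 + k * x))
    return h * total

k = 0.6
k_tilde = math.sqrt(1.0 - k * k)
print(K_prime(k), K(k_tilde))      # the two values should agree closely

ratios = [K(s) / K_prime(s) for s in (0.1, 0.3, 0.5, 0.7, 0.9)]
print(ratios)                      # should be strictly increasing in k
```

Because $K'$ is evaluated here directly from its defining integral rather than through the identity, agreement of the two printed values is a genuine (if informal) check of Exercise 24(a).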

A final observation is of significance. A basic insight into elliptic 
integrals is obtained by passing to their inverse functions. We therefore 
consider $z \mapsto \mathrm{sn}(z)$, the inverse map of $z \mapsto I(z)$.⁹ It transforms the 
closed rectangle into the closed upper half-plane. Now consider the 
series of rectangles $R = R_0, R_1, R_2, \ldots$ gotten by reflecting successively 
along the lower sides (Figure 10). 

Figure 10. Reflections of $R = R_0$ 

With $\mathrm{sn}(z)$ defined in $R_0$, we can by the reflection principle extend 
it to $R_1$ by setting $\mathrm{sn}(z) = \overline{\mathrm{sn}(\bar{z})}$ whenever $z \in R_1$ (note that then $\bar{z} \in R_0$). 
Next we can extend $\mathrm{sn}(z)$ to $R_2$ by setting $\mathrm{sn}(z) = \overline{\mathrm{sn}(-2iK' + \bar{z})}$ if 
$z \in R_2$ and noting that if $z \in R_2$, then $-2iK' + \bar{z} \in R_1$. Combining these 
reflections and continuing this way we see that we can extend $\mathrm{sn}(z)$ in 
the entire strip $-K < \mathrm{Re}(z) < K$, so that $\mathrm{sn}(z) = \mathrm{sn}(z + 2iK')$. 

Similarly, by reflecting in a series of horizontal rectangles, and combining 
these with the previous reflections, we see that $\mathrm{sn}(z)$ can be continued 
to the complex plane and also satisfies $\mathrm{sn}(z) = \mathrm{sn}(z + 4K)$. Thus $\mathrm{sn}(z)$ 
is doubly periodic (with periods $4K$ and $2iK'$). A further examination 
shows that the only singularities of $\mathrm{sn}(z)$ are poles. Functions of this type, 
called "elliptic functions," are the subject of the next chapter. 

⁹ The notation $\mathrm{sn}(z)$ in somewhat different form is due to Jacobi, and was adopted 
because of the analogy with $\sin z$. 

5 Exercises 

1. A holomorphic mapping $f : U \to V$ is a local bijection on $U$ if for every $z \in U$ 
there exists an open disc $D \subset U$ centered at $z$, so that $f : D \to f(D)$ is a bijection. 

Prove that a holomorphic map $f : U \to V$ is a local bijection on $U$ if and only 
if $f'(z) \neq 0$ for all $z \in U$. 

[Hint: Use Rouché's theorem as in the proof of Proposition 1.1.] 

2. Suppose $F(z)$ is holomorphic near $z = z_0$ and $F(z_0) = F'(z_0) = 0$, while 
$F''(z_0) \neq 0$. Show that there are two curves $\Gamma_1$ and $\Gamma_2$ that pass through $z_0$, 
are orthogonal at $z_0$, and so that $F$ restricted to $\Gamma_1$ is real and has a minimum at 
$z_0$, while $F$ restricted to $\Gamma_2$ is also real but has a maximum at $z_0$. 

[Hint: Write $F(z) = (g(z))^2$ for $z$ near $z_0$, and consider the mapping $z \mapsto g(z)$ and 
its inverse.] 


3. Suppose $U$ and $V$ are conformally equivalent. Prove that if $U$ is simply 
connected, then so is $V$. Note that this conclusion remains valid if we merely assume 
that there exists a continuous bijection between $U$ and $V$. 

4. Does there exist a holomorphic surjection from the unit disc to $\mathbb{C}$? 

[Hint: Move the upper half-plane "down" and then square it to get $\mathbb{C}$.] 

5. Prove that $f(z) = -\frac{1}{2}(z + 1/z)$ is a conformal map from the half-disc 
$\{z = x + iy : |z| < 1,\ y > 0\}$ to the upper half-plane. 

[Hint: The equation $f(z) = w$ reduces to the quadratic equation $z^2 + 2wz + 1 = 0$, 
which has two distinct roots in $\mathbb{C}$ whenever $w \neq \pm 1$. This is certainly the case if 
$w \in \mathbb{H}$.] 
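
The hint in Exercise 5 can be explored numerically. The sketch below is an informal illustration under our own naming (`f`, `f_inverse`): it solves the quadratic $z^2 + 2wz + 1 = 0$ for a sample $w$ in the upper half-plane, keeps the root of smaller modulus (the two roots have product 1), and checks that this root lies in the upper half-disc and is mapped back to $w$.

```python
import cmath

def f(z):
    """The map of Exercise 5: f(z) = -(z + 1/z) / 2."""
    return -0.5 * (z + 1 / z)

def f_inverse(w):
    """Root of z^2 + 2 w z + 1 = 0 with the smaller modulus.

    The roots are -w +/- sqrt(w^2 - 1) and their product is 1, so for w
    off the segment [-1, 1] exactly one of them lies in the unit disc.
    """
    s = cmath.sqrt(w * w - 1)
    z1, z2 = -w + s, -w - s
    return z1 if abs(z1) < abs(z2) else z2

w = 0.7 + 1.3j                 # a sample point in the upper half-plane
z = f_inverse(w)
print(z, abs(z), f(z))         # expect |z| < 1, Im z > 0, f(z) close to w
```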

6. Give another proof of Lemma 1.3 by showing directly that the Laplacian of 
$u \circ F$ is zero. 

[Hint: The real and imaginary parts of $F$ satisfy the Cauchy-Riemann equations.] 

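7. Provide all the details in the proof of the formula for the solution of the Dirichlet 
problem in a strip discussed in Section 1.3. Recall that it suffices to compute the 
solution at the points $z = iy$ with $0 < y < 1$. 

(a) Show that if $re^{i\theta} = G(iy)$, then 

$$re^{i\theta} = i\,\frac{\cos \pi y}{1 + \sin \pi y}.$$

This leads to two separate cases: either $0 < y \le 1/2$ and $\theta = \pi/2$, or 
$1/2 \le y < 1$ and $\theta = -\pi/2$. In either case, show that 

$$r^2 = \frac{1 - \sin \pi y}{1 + \sin \pi y} \quad \text{and} \quad P_r(\theta - \varphi) = \frac{\sin \pi y}{1 - \cos \pi y \sin \varphi}.$$

(b) In the integral $\frac{1}{2\pi} \int_0^{\pi} P_r(\theta - \varphi) \tilde{f}_0(\varphi)\, d\varphi$ make the change of variables 
$t = F(e^{i\varphi})$. Observe that 

$$e^{i\varphi} = \frac{i - e^{\pi t}}{i + e^{\pi t}},$$

and then take the imaginary part and differentiate both sides to establish 
the two identities 

$$\sin \varphi = \frac{1}{\cosh \pi t} \quad \text{and} \quad \frac{d\varphi}{dt} = \frac{\pi}{\cosh \pi t}.$$

Hence deduce that 

$$\frac{1}{2\pi} \int_0^{\pi} P_r(\theta - \varphi) \tilde{f}_0(\varphi)\, d\varphi = \frac{1}{2\pi} \int_0^{\pi} \frac{\sin \pi y}{1 - \cos \pi y \sin \varphi}\, \tilde{f}_0(\varphi)\, d\varphi = \frac{\sin \pi y}{2} \int_{-\infty}^{\infty} \frac{f_0(t)}{\cosh \pi t - \cos \pi y}\, dt.$$

(c) Use a similar argument to prove the formula for the integral 
$\frac{1}{2\pi} \int_{-\pi}^{0} P_r(\theta - \varphi) \tilde{f}_1(\varphi)\, d\varphi$. 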


8. Find a harmonic function $u$ in the open first quadrant that extends continuously 
up to the boundary except at the points 0 and 1, and that takes on the following 
boundary values: $u(x, y) = 1$ on the half-lines $\{y = 0,\ x > 1\}$ and $\{x = 0,\ y > 0\}$, 
and $u(x, y) = 0$ on the segment $\{0 < x < 1,\ y = 0\}$. 

[Hint: Find conformal maps $F_1, F_2, \ldots$, indicated in Figure 11. Note that 
$\frac{1}{\pi} \arg(z)$ is harmonic on the upper half-plane, equals 0 on the positive real axis, 
and 1 on the negative real axis.] 


Figure 11. Successive conformal maps in Exercise 8 

9. Prove that the function $u$ defined by 

$$u(x, y) = \mathrm{Re}\left(\frac{i + z}{i - z}\right) \quad \text{and} \quad u(0, 1) = 0$$

is harmonic in the unit disc and vanishes on its boundary. Note that $u$ is not 
bounded in $\mathbb{D}$. 

10. Let $F : \mathbb{H} \to \mathbb{C}$ be a holomorphic function that satisfies 

$$|F(z)| \le 1 \quad \text{and} \quad F(i) = 0.$$

Prove that 

$$|F(z)| \le \left|\frac{z - i}{z + i}\right| \quad \text{for all } z \in \mathbb{H}.$$


11. Show that if $f : D(0, R) \to \mathbb{C}$ is holomorphic, with $|f(z)| \le M$ for some $M > 0$, 
then 

$$\left|\frac{f(z) - f(0)}{M^2 - \overline{f(0)}\, f(z)}\right| \le \frac{|z|}{MR}.$$

[Hint: Use the Schwarz lemma.] 

12. A complex number $w \in \mathbb{D}$ is a fixed point for the map $f : \mathbb{D} \to \mathbb{D}$ if $f(w) = w$. 

(a) Prove that if $f : \mathbb{D} \to \mathbb{D}$ is analytic and has two distinct fixed points, then $f$ 
is the identity, that is, $f(z) = z$ for all $z \in \mathbb{D}$. 

(b) Must every holomorphic function $f : \mathbb{D} \to \mathbb{D}$ have a fixed point? [Hint: Consider 
the upper half-plane.] 


13. The pseudo-hyperbolic distance between two points $z, w \in \mathbb{D}$ is defined by 

$$\rho(z, w) = \left|\frac{z - w}{1 - \bar{w} z}\right|.$$

(a) Prove that if $f : \mathbb{D} \to \mathbb{D}$ is holomorphic, then 

$$\rho(f(z), f(w)) \le \rho(z, w) \quad \text{for all } z, w \in \mathbb{D}.$$

Moreover, prove that if $f$ is an automorphism of $\mathbb{D}$ then $f$ preserves the 
pseudo-hyperbolic distance 

$$\rho(f(z), f(w)) = \rho(z, w) \quad \text{for all } z, w \in \mathbb{D}.$$

[Hint: Consider the automorphism $\psi_\alpha(z) = (z - \alpha)/(1 - \bar{\alpha} z)$ and apply the 
Schwarz lemma to $\psi_{f(w)} \circ f \circ \psi_w^{-1}$.] 

(b) Prove that 

$$\frac{|f'(z)|}{1 - |f(z)|^2} \le \frac{1}{1 - |z|^2} \quad \text{for all } z \in \mathbb{D}.$$

This result is called the Schwarz-Pick lemma. See Problem 3 for an important 
application of this lemma. 
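
Before attempting the proof of Exercise 13(a), it can be instructive to test the two statements on examples. The Python sketch below is only a numerical illustration: the names `rho` and `disc_automorphism`, the sample points, and the choice of $z \mapsto z^2$ as a holomorphic self-map of the disc that is not an automorphism are all our own.

```python
import cmath

def rho(z, w):
    """Pseudo-hyperbolic distance |z - w| / |1 - conj(w) z| in the unit disc."""
    return abs((z - w) / (1 - w.conjugate() * z))

def disc_automorphism(alpha, theta=0.0):
    """Return the map z -> e^{i theta} (alpha - z) / (1 - conj(alpha) z), |alpha| < 1."""
    def psi(z):
        return cmath.exp(1j * theta) * (alpha - z) / (1 - alpha.conjugate() * z)
    return psi

z, w = 0.3 + 0.4j, -0.5 + 0.1j
psi = disc_automorphism(0.2 - 0.6j, theta=1.1)

print(rho(z, w), rho(psi(z), psi(w)))   # equal up to rounding: automorphisms preserve rho
print(rho(z * z, w * w))                # no larger than rho(z, w): z -> z^2 contracts rho
```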


14. Prove that all conformal mappings from the upper half-plane $\mathbb{H}$ to the unit 
disc $\mathbb{D}$ take the form 

$$e^{i\theta}\, \frac{z - \beta}{z - \bar{\beta}}, \quad \theta \in \mathbb{R} \text{ and } \beta \in \mathbb{H}.$$

15. Here are two properties enjoyed by automorphisms of the upper half-plane. 

(a) Suppose $\Phi$ is an automorphism of $\mathbb{H}$ that fixes three distinct points on the 
real axis. Then $\Phi$ is the identity. 

(b) Suppose $(x_1, x_2, x_3)$ and $(y_1, y_2, y_3)$ are two pairs of three distinct points on 
the real axis with 

$$x_1 < x_2 < x_3 \quad \text{and} \quad y_1 < y_2 < y_3.$$

Prove that there exists (a unique) automorphism $\Phi$ of $\mathbb{H}$ so that $\Phi(x_j) = y_j$, 
$j = 1, 2, 3$. The same conclusion holds if $y_3 < y_1 < y_2$ or $y_2 < y_3 < y_1$. 

16. Let 

$$f(z) = \frac{i - z}{i + z} \quad \text{and} \quad f^{-1}(w) = i\,\frac{1 - w}{1 + w}.$$

(a) Given $\theta \in \mathbb{R}$, find real numbers $a, b, c, d$ such that $ad - bc = 1$, and so that 
for any $z \in \mathbb{H}$ 

$$\frac{az + b}{cz + d} = f^{-1}\left(e^{i\theta} f(z)\right).$$

(b) Given $\alpha \in \mathbb{D}$ find real numbers $a, b, c, d$ so that $ad - bc = 1$, and so that for 
any $z \in \mathbb{H}$ 

$$\frac{az + b}{cz + d} = f^{-1}\left(\psi_\alpha(f(z))\right),$$

with $\psi_\alpha$ defined in Section 2.1. 

(c) Prove that if $g$ is an automorphism of the unit disc, then there exist real 
numbers $a, b, c, d$ such that $ad - bc = 1$ and so that for any $z \in \mathbb{H}$ 

$$\frac{az + b}{cz + d} = f^{-1} \circ g \circ f(z).$$

[Hint: Use parts (a) and (b).] 


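17. If $\psi_\alpha(z) = (\alpha - z)/(1 - \bar{\alpha} z)$ for $|\alpha| < 1$, prove that 

$$\frac{1}{\pi} \iint_{\mathbb{D}} |\psi_\alpha'|^2\, dx\, dy = 1 \quad \text{and} \quad \frac{1}{\pi} \iint_{\mathbb{D}} |\psi_\alpha'|\, dx\, dy = \frac{1 - |\alpha|^2}{|\alpha|^2} \log \frac{1}{1 - |\alpha|^2},$$

where in the case $\alpha = 0$ the expression on the right is understood as the limit as 
$|\alpha| \to 0$. 

[Hint: The first integral can be evaluated without a calculation. For the second, 
use polar coordinates, and for each fixed $r$ use contour integration to evaluate the 
integral in $\theta$.] 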

18. Suppose that $\Omega$ is a simply connected domain that is bounded by a piecewise-smooth 
closed curve $\gamma$ (in the terminology of Chapter 1). Then any conformal 
map $F$ of $\mathbb{D}$ to $\Omega$ extends to a continuous bijection of $\overline{\mathbb{D}}$ to $\overline{\Omega}$. The proof is simply 
a generalization of the argument used in Theorem 4.2. 

19. Prove that the complex plane slit along the union of the rays 
$\bigcup_{k=1}^{n} \{A_k + iy : y \le 0\}$ is simply connected. 

[Hint: Given a curve, first "raise" it so that it is completely contained in the upper 
half-plane.] 


20. Other examples of elliptic integrals providing conformal maps from the upper 
half-plane to rectangles are given below. 

(a) The function 

$$\int_0^z \frac{d\zeta}{\sqrt{\zeta(\zeta - 1)(\zeta - \lambda)}}, \quad \text{with } \lambda \in \mathbb{R} \text{ and } \lambda \neq 1$$

maps the upper half-plane conformally to a rectangle, one of whose vertices 
is the image of the point at infinity. 

(b) In the case $\lambda = -1$, the image of 

$$\int_0^z \frac{d\zeta}{\sqrt{\zeta(\zeta^2 - 1)}}$$

is a square whose side lengths are $\dfrac{\Gamma^2(1/4)}{2\sqrt{2\pi}}$. 


21. We consider conformal mappings to triangles. 

(a) Show that 

$$\int_0^z z^{-\beta_1} (1 - z)^{-\beta_2}\, dz,$$

with $0 < \beta_1 < 1$, $0 < \beta_2 < 1$, and $1 < \beta_1 + \beta_2 < 2$, maps $\mathbb{H}$ to a triangle 
whose vertices are the images of $0$, $1$, and $\infty$, and with angles $\alpha_1\pi$, $\alpha_2\pi$, 
and $\alpha_3\pi$, where $\alpha_j + \beta_j = 1$ and $\beta_1 + \beta_2 + \beta_3 = 2$. 

(b) What happens when $\beta_1 + \beta_2 = 1$? 

(c) What happens when $0 < \beta_1 + \beta_2 < 1$? 

(d) In (a), the length of the side of the triangle opposite angle $\alpha_j\pi$ is 

$$\frac{\sin(\alpha_j \pi)}{\pi}\, \Gamma(\alpha_1) \Gamma(\alpha_2) \Gamma(\alpha_3).$$


22. If $P$ is a simply connected region bounded by a polygon with vertices $a_1, \ldots, a_n$ 
and angles $\alpha_1\pi, \ldots, \alpha_n\pi$, and $F$ is a conformal map of the disc $\mathbb{D}$ to $P$, then there 
exist complex numbers $B_1, \ldots, B_n$ on the unit circle, and constants $c_1$ and $c_2$ so 
that 

$$F(z) = c_1 \int_1^z \frac{d\zeta}{(\zeta - B_1)^{\beta_1} \cdots (\zeta - B_n)^{\beta_n}} + c_2.$$

[Hint: This follows from the standard correspondence between $\mathbb{H}$ and $\mathbb{D}$ and an 
argument similar to that used in the proof of Theorem 4.7.] 


23. If 

$$F(z) = \int_1^z \frac{d\zeta}{(1 - \zeta^n)^{2/n}},$$

then $F$ maps the unit disc conformally onto the interior of a regular polygon with 
$n$ sides and perimeter 

$$2^{\frac{n-2}{n}} \int_0^{\pi} (\sin \theta)^{-2/n}\, d\theta.$$


24. The elliptic integrals $K$ and $K'$ defined for $0 < k < 1$ by 

$$K(k) = \int_0^1 \frac{dx}{((1 - x^2)(1 - k^2x^2))^{1/2}} \quad \text{and} \quad K'(k) = \int_1^{1/k} \frac{dx}{((x^2 - 1)(1 - k^2x^2))^{1/2}}$$

satisfy various interesting identities. For instance: 

(a) Show that if $\tilde{k}^2 = 1 - k^2$ and $0 < \tilde{k} < 1$, then 

$$K'(k) = K(\tilde{k}).$$

[Hint: Change variables $x = (1 - \tilde{k}^2 y^2)^{-1/2}$ in the integral defining $K'(k)$.] 

(b) Prove that if $\tilde{k}^2 = 1 - k^2$, and $0 < \tilde{k} < 1$, then 

$$K(k) = \frac{2}{1 + \tilde{k}}\, K\!\left(\frac{1 - \tilde{k}}{1 + \tilde{k}}\right).$$

[Hint: Change variables $x = 2t/(1 + \tilde{k} + (1 - \tilde{k})t^2)$.] 

(c) Show that for $0 < k < 1$ one has 

$$K(k) = \frac{\pi}{2}\, F(1/2, 1/2, 1; k^2),$$

where $F$ is the hypergeometric series. [Hint: This follows from the integral 
representation for $F$ given in Exercise 9, Chapter 6.] 


6 Problems 

1. Let $f$ be a complex-valued $C^1$ function defined in the neighborhood of a point 
$z_0$. There are several notions closely related to conformality at $z_0$. We say that 
$f$ is isogonal at $z_0$ if whenever $\gamma(t)$ and $\eta(t)$ are two smooth curves with $\gamma(0) = 
\eta(0) = z_0$, that make an angle $\theta$ there ($|\theta| < \pi$), then $f(\gamma(t))$ and $f(\eta(t))$ make an 
angle of $\theta'$ at $t = 0$ with $|\theta'| = |\theta|$ for all $\theta$. Also, $f$ is said to be isotropic if it 
magnifies lengths by some factor for all directions emanating from $z_0$, that is, if 
the limit 

$$\lim_{r \to 0} \frac{|f(z_0 + re^{i\theta}) - f(z_0)|}{r}$$

exists, is non-zero, and independent of $\theta$. 

Then $f$ is isogonal at $z_0$ if and only if it is isotropic at $z_0$; moreover, $f$ is isogonal 
at $z_0$ if and only if either $f'(z_0)$ exists and is non-zero, or the same holds for $f$ 
replaced by $\bar{f}$. 

2. The angle between two non-zero complex numbers $z$ and $w$ (taken in that order) 
is simply the oriented angle, in $(-\pi, \pi]$, that is formed between the two vectors in 
$\mathbb{R}^2$ corresponding to the points $z$ and $w$. This oriented angle, say $\alpha$, is uniquely 
determined by the two quantities 

$$\frac{(z, w)}{|z|\,|w|} \quad \text{and} \quad \frac{(z, -iw)}{|z|\,|w|}$$

which are simply the cosine and sine of $\alpha$, respectively. Here, the notation $(\cdot, \cdot)$ 
corresponds to the usual Euclidean inner product in $\mathbb{R}^2$, which in terms of complex 
numbers takes the form $(z, w) = \mathrm{Re}(z\bar{w})$. 

In particular, we may now consider two smooth curves $\gamma : [a, b] \to \mathbb{C}$ and $\eta : 
[a, b] \to \mathbb{C}$, that intersect at $z_0$, say $\gamma(t_0) = \eta(t_0) = z_0$, for some $t_0 \in (a, b)$. If the 
quantities $\gamma'(t_0)$ and $\eta'(t_0)$ are non-zero, then they represent the tangents to the 
curves $\gamma$ and $\eta$ at the point $z_0$, and we say that the two curves intersect at $z_0$ at 
the angle formed by the two vectors $\gamma'(t_0)$ and $\eta'(t_0)$. 

A holomorphic function $f$ defined near $z_0$ is said to preserve angles at $z_0$ if 
for any two smooth curves $\gamma$ and $\eta$ intersecting at $z_0$, the angle formed between 
the curves $\gamma$ and $\eta$ at $z_0$ equals the angle formed between the curves $f \circ \gamma$ and 
$f \circ \eta$ at $f(z_0)$. (See Figure 12 for an illustration.) In particular, we assume that 
the tangents to the curves $\gamma$, $\eta$, $f \circ \gamma$, and $f \circ \eta$ at the point $z_0$ and $f(z_0)$ are all 
non-zero. 

Figure 12. Preservation of angles at $z_0$ 

(a) Prove that if $f : \Omega \to \mathbb{C}$ is holomorphic, and $f'(z_0) \neq 0$, then $f$ preserves 
angles at $z_0$. [Hint: Observe that 

$$(f'(z_0)\gamma'(t_0),\, f'(z_0)\eta'(t_0)) = |f'(z_0)|^2\, (\gamma'(t_0), \eta'(t_0)).]$$

(b) Conversely, prove the following: suppose $f : \Omega \to \mathbb{C}$ is a complex-valued 
function, that is real-differentiable at $z_0 \in \Omega$, and $J_f(z_0) \neq 0$. If $f$ preserves 
angles at $z_0$, then $f$ is holomorphic at $z_0$ with $f'(z_0) \neq 0$. 







3.* The Schwarz-Pick lemma (see Exercise 13) is the infinitesimal version of an 
important observation in complex analysis and geometry. 

For complex numbers $w \in \mathbb{C}$ and $z \in \mathbb{D}$ we define the hyperbolic length of $w$ 
at $z$ by 

$$\|w\|_z = \frac{|w|}{1 - |z|^2},$$

where $|w|$ and $|z|$ denote the usual absolute values. This length is sometimes 
referred to as the Poincaré metric, and as a Riemann metric it is written as 

$$ds^2 = \frac{|dz|^2}{(1 - |z|^2)^2}.$$

The idea is to think of $w$ as a vector lying in the tangent space at $z$. Observe that 
for a fixed $w$, its hyperbolic length grows to infinity as $z$ approaches the boundary 
of the disc. We pass from the infinitesimal hyperbolic length of tangent vectors to 
the global hyperbolic distance between two points by integration. 

(a) Given two complex numbers $z_1$ and $z_2$ in the disc, we define the hyperbolic 
distance between them by 

$$d(z_1, z_2) = \inf_{\gamma} \int_0^1 \|\gamma'(t)\|_{\gamma(t)}\, dt,$$

where the infimum is taken over all smooth curves $\gamma : [0, 1] \to \mathbb{D}$ joining $z_1$ 
and $z_2$. Use the Schwarz-Pick lemma to prove that if $f : \mathbb{D} \to \mathbb{D}$ is holomorphic, 
then 

$$d(f(z_1), f(z_2)) \le d(z_1, z_2) \quad \text{for any } z_1, z_2 \in \mathbb{D}.$$

In other words, holomorphic functions are distance-decreasing in the hyperbolic 
metric. 

(b) Prove that automorphisms of the unit disc preserve the hyperbolic distance, 
namely 

$$d(\varphi(z_1), \varphi(z_2)) = d(z_1, z_2), \quad \text{for any } z_1, z_2 \in \mathbb{D}$$

and any automorphism $\varphi$. Conversely, if $\varphi : \mathbb{D} \to \mathbb{D}$ preserves the hyperbolic 
distance, then either $\varphi$ or $\bar{\varphi}$ is an automorphism of $\mathbb{D}$. 

(c) Given two points $z_1, z_2 \in \mathbb{D}$, show that there exists an automorphism $\varphi$ such 
that $\varphi(z_1) = 0$ and $\varphi(z_2) = s$ for some $s$ on the segment $[0, 1)$ on the real 
line. 

(d) Prove that the hyperbolic distance between $0$ and $s \in [0, 1)$ is 

$$d(0, s) = \frac{1}{2} \log \frac{1 + s}{1 - s}.$$

(e) Find a formula for the hyperbolic distance between any two points in the 
unit disc. 


4.* Consider the group of matrices of the form 

$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

that satisfy the following conditions: 

(i) $a$, $b$, $c$, and $d \in \mathbb{C}$, 

(ii) the determinant of $M$ is equal to 1, 

(iii) the matrix $M$ preserves the following hermitian form on $\mathbb{C}^2 \times \mathbb{C}^2$: 

$$\langle Z, W \rangle = z_1 \bar{w}_1 - z_2 \bar{w}_2,$$

where $Z = (z_1, z_2)$ and $W = (w_1, w_2)$. In other words, for all $Z, W \in \mathbb{C}^2$ 

$$\langle MZ, MW \rangle = \langle Z, W \rangle.$$

This group of matrices is denoted by SU(1,1). 

(a) Prove that all matrices in SU(1,1) are of the form 

$$\begin{pmatrix} a & b \\ \bar{b} & \bar{a} \end{pmatrix},$$

where $|a|^2 - |b|^2 = 1$. To do so, consider the matrix 

$$J = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$

and observe that $\langle Z, W \rangle = {}^t\overline{W} J Z$, where ${}^t\overline{W}$ denotes the conjugate 
transpose of $W$. 

(b) To every matrix in SU(1,1) we can associate a fractional linear transformation 

$$\frac{az + b}{cz + d}.$$

Prove that the group SU(1,1)/{±1} is isomorphic to the group of automorphisms 
of the disc. [Hint: Use the following association.] 

5. The following result is relevant to Problem 4 in Chapter 10 which treats modular 
functions. 

(a) Suppose that $F : \mathbb{H} \to \mathbb{C}$ is holomorphic and bounded. Also, suppose that 
$F(z)$ vanishes when $z = ir_n$, $n = 1, 2, 3, \ldots$, where $\{r_n\}$ is a bounded 
sequence of positive numbers. Prove that if $\sum_{n=1}^{\infty} r_n = \infty$, then $F = 0$. 

(b) If $\sum r_n < \infty$, it is possible to construct a bounded function on the upper 
half-plane with zeros precisely at the points $ir_n$. 

[For related results in the unit disc, see Problems 1 and 2 in Chapter 5.] 


6.* The results of Exercise 18 extend to the case when $\gamma$ is assumed merely to be 
closed, simple, and continuous. The proof, however, requires further ideas. 

7.* Applying ideas of Carathéodory, Koebe gave a proof of the Riemann mapping 
theorem by constructing (more explicitly) a sequence of functions that converges 
to the desired conformal map. 

Starting with a Koebe domain, that is, a simply connected domain $\mathcal{K}_0 \subset \mathbb{D}$ that 
is not all of $\mathbb{D}$, and which contains the origin, the strategy is to find an injective 
function $f_0$ such that $f_0(\mathcal{K}_0) = \mathcal{K}_1$ is a Koebe domain "larger" than $\mathcal{K}_0$. Then, one 
iterates this process, finally obtaining functions $F_n = f_n \circ \cdots \circ f_0 : \mathcal{K}_0 \to \mathbb{D}$ such 
that $F_n(\mathcal{K}_0) = \mathcal{K}_{n+1}$ and $\lim F_n = F$ is a conformal map from $\mathcal{K}_0$ to $\mathbb{D}$. 

The inner radius of a region $\mathcal{K} \subset \mathbb{D}$ that contains the origin is defined by 
$r_{\mathcal{K}} = \sup\{\rho \ge 0 : D(0, \rho) \subset \mathcal{K}\}$. Also, a holomorphic injection $f : \mathcal{K} \to \mathbb{D}$ is said to 
be an expansion if $f(0) = 0$ and $|f(z)| > |z|$ for all $z \in \mathcal{K} - \{0\}$. 

(a) Prove that if $f$ is an expansion, then $r_{f(\mathcal{K})} \ge r_{\mathcal{K}}$ and $|f'(0)| > 1$. [Hint: 
Write $f(z) = zg(z)$ and use the maximum principle to prove that $|f'(0)| = 
|g(0)| > 1$.] 

Suppose we begin with a Koebe domain $\mathcal{K}_0$ and a sequence of expansions 
$\{f_0, f_1, \ldots, f_n, \ldots\}$, so that $\mathcal{K}_{n+1} = f_n(\mathcal{K}_n)$ are also Koebe domains. We then 
define holomorphic maps $F_n : \mathcal{K}_0 \to \mathbb{D}$ by $F_n = f_n \circ \cdots \circ f_0$. 

(b) Prove that for each $n$, the function $F_n$ is an expansion. Moreover, 
$F_n'(0) = \prod_{k=0}^{n} f_k'(0)$, and conclude that $\lim_{n \to \infty} |f_n'(0)| = 1$. [Hint: Prove 
that the sequence $\{|F_n'(0)|\}$ has a limit by showing that it is bounded above 
and monotone increasing. Use the Schwarz lemma.] 

(c) Show that if the sequence is osculating, that is, $r_{\mathcal{K}_n} \to 1$ as $n \to \infty$, then 
$\{F_n\}$ converges uniformly on compact subsets of $\mathcal{K}_0$ to a conformal map 
$F : \mathcal{K}_0 \to \mathbb{D}$. [Hint: If $r_{F(\mathcal{K}_0)} \ge 1$ then $F$ is surjective.] 

To construct the desired osculating sequence we shall use the automorphisms 
$\psi_\alpha = (\alpha - z)/(1 - \bar{\alpha} z)$. 

(d) Given a Koebe domain $\mathcal{K}$, choose a point $\alpha \in \mathbb{D}$ on the boundary of $\mathcal{K}$ such 
that $|\alpha| = r_{\mathcal{K}}$, and also choose $\beta \in \mathbb{D}$ such that $\beta^2 = \alpha$. Let $S$ denote the 
square root of $\psi_\alpha$ on $\mathcal{K}$ such that $S(0) = 0$. Why is such a function well 
defined? Prove that the function $f : \mathcal{K} \to \mathbb{D}$ defined by $f(z) = \psi_\beta \circ S \circ \psi_\alpha$ 
is an expansion. Moreover, show that $|f'(0)| = (1 + r_{\mathcal{K}})/(2\sqrt{r_{\mathcal{K}}})$. [Hint: To 
prove that $|f(z)| > |z|$ on $\mathcal{K} - \{0\}$ apply the Schwarz lemma to the inverse 
function, namely $\psi_\alpha \circ g \circ \psi_\beta$ where $g(z) = z^2$.] 

(e) Use part (d) to construct the desired sequence. 


8.* Let $f$ be an injective holomorphic function in the unit disc, with $f(0) = 0$ and 
$f'(0) = 1$. If we write $f(z) = z + a_2 z^2 + a_3 z^3 + \cdots$, then Problem 1 in Chapter 3 
shows that $|a_2| \le 2$. Bieberbach conjectured that in fact $|a_n| \le n$ for all $n \ge 2$; 
this was proved by de Branges. This problem outlines an argument to prove the 
conjecture under the additional assumption that the coefficients $a_n$ are real. 

(a) Let $z = re^{i\theta}$ with $0 < r < 1$, and show that if $v(r, \theta)$ denotes the imaginary 
part of $f(re^{i\theta})$, then 

$$a_n r^n = \frac{2}{\pi} \int_0^{\pi} v(r, \theta) \sin n\theta\, d\theta.$$

(b) Show that for $0 \le \theta \le \pi$ and $n = 1, 2, \ldots$ we have $|\sin n\theta| \le n \sin \theta$. 

(c) Use the fact that $a_n \in \mathbb{R}$ to show that $f(\mathbb{D})$ is symmetric with respect to 
the real axis, and use this fact to show that $f$ maps the upper half-disc into 
either the upper or lower part of $f(\mathbb{D})$. 

(d) Show that for $r$ small, 

$$v(r, \theta) = r \sin \theta\, [1 + O(r)],$$

and use the previous part to conclude that $v(r, \theta) \sin \theta \ge 0$ for all $0 < r < 1$ 
and $0 \le \theta \le \pi$. 

(e) Prove that $|a_n r^n| \le nr$, and let $r \to 1$ to conclude that $|a_n| \le n$. 

(f) Check that the function $f(z) = z/(1 - z)^2$ satisfies all the hypotheses and 
that $|a_n| = n$ for all $n$. 
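
The coefficient claim in part (f) is easy to confirm by machine for the first few terms. The sketch below is an illustration only; the helper name `koebe_coefficients` is ours. It squares the geometric series $1/(1 - z) = \sum z^n$ by a Cauchy product and then multiplies by $z$.

```python
def koebe_coefficients(N):
    """Return [a_0, ..., a_{N-1}], the Taylor coefficients of z / (1 - z)^2."""
    geom = [1] * N                                   # 1/(1-z) = 1 + z + z^2 + ...
    # Cauchy product of the geometric series with itself gives 1/(1-z)^2.
    square = [sum(geom[i] * geom[n - i] for i in range(n + 1)) for n in range(N)]
    return [0] + square[:N - 1]                      # multiplying by z shifts the indices

print(koebe_coefficients(8))   # the coefficient of z^n should be n
```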


9.* Gauss found a connection between elliptic integrals and the familiar operations 
of forming arithmetic and geometric means. 

We start with any pair $(a, b)$ of numbers that satisfy $a \ge b > 0$, and form the 
arithmetic and geometric means of $a$ and $b$, that is, 

$$a_1 = \frac{a + b}{2} \quad \text{and} \quad b_1 = (ab)^{1/2}.$$

We then repeat these operations with $a$ and $b$ replaced by $a_1$ and $b_1$. Iterating 
this process provides two sequences $\{a_n\}$ and $\{b_n\}$ where $a_{n+1}$ and $b_{n+1}$ are the 
arithmetic and geometric means of $a_n$ and $b_n$, respectively. 

(a) Prove that the two sequences $\{a_n\}$ and $\{b_n\}$ have a common limit. This 
limit, which we denote by $M(a, b)$, is called the arithmetic-geometric 
mean of $a$ and $b$. [Hint: Show that $a \ge a_1 \ge a_2 \ge \cdots \ge a_n \ge b_n \ge \cdots \ge 
b_1 \ge b$ and $a_n - b_n \le (a - b)/2^n$.] 

(b) Gauss's identity states that 

$$\frac{1}{M(a, b)} = \frac{2}{\pi} \int_0^{\pi/2} \frac{d\theta}{(a^2 \cos^2 \theta + b^2 \sin^2 \theta)^{1/2}}.$$

To prove this relation, show that if $I(a, b)$ denotes the integral on the right-hand 
side, then it suffices to establish the invariance of $I$, namely 

$$I(a, b) = I\!\left(\frac{a + b}{2},\, (ab)^{1/2}\right). \tag{6}$$

Then, observe that the connection with elliptic integrals takes the form 

$$I(a, b) = \frac{1}{a} K(k) = \frac{1}{a} \int_0^1 \frac{dx}{\sqrt{(1 - x^2)(1 - k^2x^2)}} \quad \text{where } k^2 = 1 - b^2/a^2,$$

and that the relation (6) is a consequence of the identity in (b) of Exercise 24. 
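
Gauss's identity and the invariance (6) lend themselves to a quick numerical experiment. The following Python sketch is an illustration under our own choices (the names `agm` and `gauss_integral`, the stopping tolerance, and the midpoint rule are assumptions, not part of the problem); it is no substitute for the proof asked for above.

```python
import math

def agm(a, b, max_iter=60):
    """Arithmetic-geometric mean M(a, b) for a >= b > 0."""
    for _ in range(max_iter):
        if abs(a - b) <= 1e-15 * a:     # the two sequences have merged numerically
            break
        a, b = (a + b) / 2.0, math.sqrt(a * b)
    return (a + b) / 2.0

def gauss_integral(a, b, n=2000):
    """I(a, b): integral over [0, pi/2] of 1 / sqrt(a^2 cos^2 + b^2 sin^2), midpoint rule."""
    h = (math.pi / 2) / n
    total = 0.0
    for j in range(n):
        t = (j + 0.5) * h
        total += 1.0 / math.sqrt((a * math.cos(t)) ** 2 + (b * math.sin(t)) ** 2)
    return h * total

a, b = 2.0, 1.0
# Gauss's identity: both printed numbers should agree.
print(1.0 / agm(a, b), (2.0 / math.pi) * gauss_integral(a, b))
# Invariance (6): replacing (a, b) by their two means leaves I unchanged.
print(gauss_integral(a, b), gauss_integral((a + b) / 2.0, math.sqrt(a * b)))
```

The rapid convergence of the iteration in `agm` reflects the estimate $a_n - b_n \le (a - b)/2^n$ from part (a), which in practice is far from sharp.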









An Introduction to Elliptic 
Functions 


The form that Jacobi had given to the theory of elliptic 
functions was far from perfection; its flaws are obvious. 
At the base we find three fundamental functions sn, 
cn and dn. These functions do not have the same 
periods... 

In Weierstrass' system, instead of three fundamental 
functions, there is only one, $\wp(u)$, and it is the 
simplest of all having the same periods. It has only 
one double infinity; and finally its definition is so that 
it does not change when one replaces one system of 
periods by another equivalent system. 

H. Poincaré, 1899 


The theory of elliptic functions, which is of interest in several parts of 
mathematics, initially grew out of the study of elliptic integrals. These 
can be described generally as integrals of the form $\int R(x, \sqrt{P(x)})\, dx$, 
where $R$ is a rational function and $P$ a polynomial of degree three or 
four.¹ These integrals arose in computing the arc-length of an ellipse, or 
of a lemniscate, and in a variety of other problems. Their early study was 
centered on their special transformation properties and on the discovery 
of an inherent double-periodicity. We have seen an example of this latter 
phenomenon in the mapping function of the half-plane to a rectangle 
taken up in Section 4.5 of the previous chapter. 

It was Jacobi who transformed the subject by initiating the systematic 
study of doubly-periodic functions (called elliptic functions). In this theory, 
the theta functions he introduced played a decisive role. Weierstrass 
after him developed another approach, which in its initial steps is simpler 
and more elegant. It is based on his $\wp$ function, and in this chapter we 
shall sketch the beginnings of that theory. We will go as far as to glimpse 
a possible connection with number theory, by considering the Eisenstein 
series and their expression involving divisor functions. A number of more 
direct links with combinatorics and number theory arise from the theta 
functions, which we will take up in the next chapter. The remarkable 
facts we shall see there attest to the great interest of these functions in 
mathematics. As such they ought to soften the harsh opinion expressed 
above about the imperfection of Jacobi's theory. 

¹ The case when $P$ is a quadratic polynomial is essentially that of "circular functions", 
and can be reduced to the trigonometric functions $\sin x$, $\cos x$, etc. 

1 Elliptic functions 

We are interested in meromorphic functions $f$ on $\mathbb{C}$ that have two periods; 
that is, there are two non-zero complex numbers $\omega_1$ and $\omega_2$ such that 

$$f(z + \omega_1) = f(z) \quad \text{and} \quad f(z + \omega_2) = f(z),$$

for all $z \in \mathbb{C}$. A function with two periods is said to be doubly periodic. 

The case when $\omega_1$ and $\omega_2$ are linearly dependent over $\mathbb{R}$, that is 
$\omega_2/\omega_1 \in \mathbb{R}$, is uninteresting. Indeed, Exercise 1 shows that in this case $f$ 
is either periodic with a simple period (if the quotient $\omega_2/\omega_1$ is rational) 
or $f$ is constant (if $\omega_2/\omega_1$ is irrational). Therefore, we make the following 
assumption: the periods $\omega_1$ and $\omega_2$ are linearly independent over $\mathbb{R}$. 

We now describe a normalization that we shall use extensively in this 
chapter. Let $\tau = \omega_2/\omega_1$. Since $\tau$ and $1/\tau$ have imaginary parts of opposite 
signs, and since $\tau$ is not real, we may assume (after possibly interchanging 
the roles of $\omega_1$ and $\omega_2$) that $\mathrm{Im}(\tau) > 0$. Observe now that the 
function $f$ has periods $\omega_1$ and $\omega_2$ if and only if the function $F(z) = f(\omega_1 z)$ 
has periods $1$ and $\tau$, and moreover, the function $f$ is meromorphic if and 
only if $F$ is meromorphic. Also the properties of $f$ are immediately 
deducible from those of $F$. We may therefore assume, without loss of 
generality, that $f$ is a meromorphic function on $\mathbb{C}$ with periods $1$ and $\tau$ 
where $\mathrm{Im}(\tau) > 0$. 

Successive applications of the periodicity conditions yield 
(1) f(z + n + mr) = f(z) for all integers n, m and all 2 ： G C, 
and it is therefore natural to consider the lattice in C defined by 
A = {n + mr : n, m G Z}. 

We say that 1 and r generate A (see Figure 1). 

Equation (1) says that / is constant under translations by elements 
of A. Associated to the lattice A is the fundamental parallelogram 
defined by 


Po = {z E C : z = a br where 0 < a < 1 and 0 < 6 < 1}. 
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Figure 1 . The lattice A generated by 1 and r 


The importance of the fundamental parallelogram comes from the fact
that $f$ is completely determined by its behavior on $P_0$. To see this, we
need a definition: two complex numbers $z$ and $w$ are congruent modulo
$\Lambda$ if

$$ z = w + n + m\tau \quad \text{for some } n, m \in \mathbb{Z}, $$

and we write $z \sim w$. In other words, $z$ and $w$ differ by a point in the
lattice, $z - w \in \Lambda$. By (1) we conclude that $f(z) = f(w)$ whenever
$z \sim w$. If we can show that any point $z \in \mathbb{C}$ is congruent to a
unique point in $P_0$ then we will have proved that $f$ is completely
determined by its values in the fundamental parallelogram. Suppose
$z = x + iy$ is given, and write $z = a + b\tau$ where $a, b \in \mathbb{R}$.
This is possible since 1 and $\tau$ form a basis over the reals of the
two-dimensional vector space $\mathbb{C}$. Then choose $n$ and $m$ to be the
greatest integers $\le a$ and $\le b$, respectively. If we let
$w = z - n - m\tau$, then by definition $z \sim w$, and moreover
$w = (a - n) + (b - m)\tau$. By construction, it is clear that $w \in P_0$.
To prove uniqueness, suppose that $w$ and $w'$ are two points in $P_0$ that
are congruent. If we write $w = a + b\tau$ and $w' = a' + b'\tau$, then
$w - w' = (a - a') + (b - b')\tau \in \Lambda$, and therefore both $a - a'$
and $b - b'$ are integers. But since $0 \le a, a' < 1$, we have
$-1 < a - a' < 1$, which then implies $a - a' = 0$. Similarly $b - b' = 0$,
and we conclude that $w = w'$.
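
The existence half of this argument is constructive, so it can be sketched
in a few lines of Python. This is only an illustration: the function name
`reduce_to_fundamental` and the sample values are assumptions, not part of
the text, and floating-point rounding may misplace points lying exactly on
the boundary of the parallelogram.

```python
import math

def reduce_to_fundamental(z: complex, tau: complex):
    """Return (w, n, m) with z = w + n + m*tau and w in P_0 (hypothetical helper)."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    # Write z = a + b*tau with a, b real: the imaginary part determines b.
    b = z.imag / tau.imag
    a = z.real - b * tau.real
    # n and m are the greatest integers <= a and <= b, as in the text.
    n, m = math.floor(a), math.floor(b)
    return z - n - m * tau, n, m

w, n, m = reduce_to_fundamental(2.25 + 3.5j, 1j)
print(w, n, m)
```

For the square lattice ($\tau = i$) the sample point $2.25 + 3.5i$ is sent to
$0.25 + 0.5i$ with $n = 2$ and $m = 3$.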

More generally, a period parallelogram $P$ is any translate of the
fundamental parallelogram, $P = P_0 + h$ with $h \in \mathbb{C}$ (see Figure 2).

Since we can apply the preceding argument to $z - h$, we conclude that
every point in $\mathbb{C}$ is congruent to a unique point in a given period
parallelogram. Therefore, $f$ is uniquely determined by its behavior on any
period parallelogram.

Figure 2. A period parallelogram



Finally, note that $\Lambda$ and $P_0$ give rise to a covering (or tiling)
of the complex plane

$$ \mathbb{C} = \bigcup_{n,m \in \mathbb{Z}} (n + m\tau + P_0), \tag{2} $$

and moreover, this union is disjoint. This is immediate from the facts
we just collected and the definition of $P_0$. We summarize what we have
seen so far.

Proposition 1.1 Suppose $f$ is a meromorphic function with two periods
1 and $\tau$ which generate the lattice $\Lambda$. Then:

(i) Every point in $\mathbb{C}$ is congruent to a unique point in the
fundamental parallelogram.

(ii) Every point in $\mathbb{C}$ is congruent to a unique point in any given
period parallelogram.

(iii) The lattice $\Lambda$ provides a disjoint covering of the complex
plane, in the sense of (2).

(iv) The function $f$ is completely determined by its values in any period
parallelogram.

1.1 Liouville’s theorems

We can now see why we assumed from the beginning that $f$ is meromorphic
rather than just holomorphic.

Theorem 1.2 An entire doubly periodic function is constant.

Proof. The function is completely determined by its values on $P_0$,
and since the closure of $P_0$ is compact, we conclude that the function is
bounded on $\mathbb{C}$, hence constant by Liouville’s theorem in Chapter 2.

A non-constant doubly periodic meromorphic function is called an elliptic
function. Since a meromorphic function can have only finitely many zeros
and poles in any large disc, we see that an elliptic function will have
only finitely many zeros and poles in any given period parallelogram, and
in particular, this is true in the fundamental parallelogram. Of course,
nothing excludes $f$ from having a pole or zero on the boundary of $P_0$.

As usual, we count poles and zeros with multiplicities. Keeping this
in mind we can prove the following theorem.

Theorem 1.3 The total number of poles of an elliptic function in $P_0$ is
always $\ge 2$.

In other words, $f$ cannot have only one simple pole. It must have at
least two poles, and this does not exclude the case of a single pole of
multiplicity $\ge 2$.

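Proof. Suppose first that $f$ has no poles on the boundary $\partial P_0$ of
the fundamental parallelogram. By the residue theorem we have

$$ \int_{\partial P_0} f(z)\,dz = 2\pi i \sum \mathrm{res}\, f, $$

and we contend that the integral is 0. To see this, we simply use the
periodicity of $f$. Note that

$$ \int_{\partial P_0} f(z)\,dz = \int_0^1 f(z)\,dz + \int_1^{1+\tau} f(z)\,dz + \int_{1+\tau}^{\tau} f(z)\,dz + \int_{\tau}^{0} f(z)\,dz, $$

and the integrals over opposite sides cancel out. For instance

$$ \begin{aligned}
\int_0^1 f(z)\,dz + \int_{1+\tau}^{\tau} f(z)\,dz
&= \int_0^1 f(z)\,dz + \int_1^0 f(s + \tau)\,ds \\
&= \int_0^1 f(z)\,dz + \int_1^0 f(s)\,ds \\
&= \int_0^1 f(z)\,dz - \int_0^1 f(z)\,dz \\
&= 0,
\end{aligned} $$

and similarly for the other pair of sides. Hence $\int_{\partial P_0} f = 0$
and $\sum \mathrm{res}\, f = 0$. Therefore $f$ must have at least two poles
in $P_0$.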

If $f$ has a pole on $\partial P_0$, choose a small $h \in \mathbb{C}$ so that
if $P = h + P_0$, then $f$ has no poles on $\partial P$. Arguing as before, we
find that $f$ must have at least two poles in $P$, and therefore the same
conclusion holds for $P_0$.

The total number of poles (counted according to their multiplicities)
of an elliptic function is called its order. The next theorem says that
elliptic functions have as many zeros as they have poles, if the zeros are
counted with their multiplicities.

Theorem 1.4 Every elliptic function of order $m$ has $m$ zeros in $P_0$.

Proof. Assuming first that $f$ has no zeros or poles on the boundary
of $P_0$, we know by the argument principle in Chapter 3 that

$$ \frac{1}{2\pi i} \int_{\partial P_0} \frac{f'(z)}{f(z)}\,dz = \mathcal{N}_z - \mathcal{N}_p, $$

where $\mathcal{N}_z$ and $\mathcal{N}_p$ denote the number of zeros and poles
of $f$ in $P_0$, respectively. By periodicity, we can argue as in the proof of
the previous theorem to find that $\int_{\partial P_0} f'/f = 0$, and therefore
$\mathcal{N}_z = \mathcal{N}_p$.

In the case when a pole or zero of $f$ lies on $\partial P_0$ it suffices to
apply the argument to a translate of $P_0$.

As a consequence, if $f$ is elliptic then the equation $f(z) = c$ has as
many solutions as the order of $f$ for every $c \in \mathbb{C}$, simply because
$f - c$ is elliptic and has as many poles as $f$.

Despite the rather simple nature of the theorems above, there remains 
the question of showing that elliptic functions exist. We now turn to a 
constructive solution of this problem. 

1.2 The Weierstrass $\wp$ function
An elliptic function of order two

This section is devoted to the basic example of an elliptic function. As
we have seen above, any elliptic function must have at least two poles;
we shall in fact construct one whose only singularity will be a double
pole at the points of the lattice generated by the periods.

Before looking at the case of doubly-periodic functions, let us first
consider briefly functions with only a single period. If one wished to
construct a function with period 1 and poles at all the integers, a simple
choice would be the sum

$$ F(z) = \sum_{n=-\infty}^{\infty} \frac{1}{z+n}. $$

Note that the sum remains unchanged if we replace $z$ by $z + 1$, and the
poles are at the integers. However, the series defining $F$ is not absolutely
convergent, and to remedy this problem, we sum symmetrically, that is,
we define

$$ F(z) = \lim_{N \to \infty} \sum_{|n| \le N} \frac{1}{z+n} = \frac{1}{z} + \sum_{n=1}^{\infty} \left[ \frac{1}{z+n} + \frac{1}{z-n} \right]. $$

On the far right-hand side, we have paired up the terms corresponding
to $n$ and $-n$, a trick which makes the quantity in brackets $O(1/n^2)$,
and hence the last sum is absolutely convergent. As a consequence, $F$
is meromorphic with poles precisely at the integers. In fact, we proved
earlier in Chapter 5 that $F(z) = \pi \cot \pi z$.
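
As a numerical sanity check (not a proof), the paired partial sums can be
compared with $\pi \cot \pi z$ at a sample point. The helper names and the
test point below are assumptions chosen for illustration; the error should
shrink roughly like $1/N$.

```python
import cmath
import math

def symmetric_partial_sum(z: complex, N: int) -> complex:
    """1/z + sum_{n=1}^{N} [1/(z+n) + 1/(z-n)], the paired form of the series."""
    total = 1 / z
    for n in range(1, N + 1):
        total += 1 / (z + n) + 1 / (z - n)
    return total

def pi_cot(z: complex) -> complex:
    """pi * cot(pi z), computed from cos and sin."""
    return math.pi * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)

z = 0.3 + 0.2j  # arbitrary non-integer sample point
for N in (10, 100, 1000, 10000):
    print(N, abs(symmetric_partial_sum(z, N) - pi_cot(z)))
```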

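There is a second way to deal with the series
$\sum_{n=-\infty}^{\infty} 1/(z + n)$, which is to write it as

$$ \frac{1}{z} + \sum_{n \ne 0} \left[ \frac{1}{z+n} - \frac{1}{n} \right], $$

where the sum is taken over all non-zero integers. Notice that
$1/(z + n) - 1/n = O(1/n^2)$, which makes this series absolutely convergent.
Moreover, since

$$ \frac{1}{z+n} + \frac{1}{z-n} = \left( \frac{1}{z+n} - \frac{1}{n} \right) + \left( \frac{1}{z-n} - \frac{1}{-n} \right), $$

we get the same sum as before.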

In analogy to this, the idea is to mimic the above to produce our first
example of an elliptic function. We would like to write it as

$$ \sum_{\omega \in \Lambda} \frac{1}{(z + \omega)^2}, $$

but again this series does not converge absolutely. There are several
approaches to try to make sense of this series (see Problem 1), but the
simplest is to follow the second way we dealt with the cotangent series.

To overcome the non-absolute convergence of the series, let $\Lambda^*$
denote the lattice minus the origin, that is, $\Lambda^* = \Lambda - \{(0,0)\}$,
and consider instead the following series:

$$ \frac{1}{z^2} + \sum_{\omega \in \Lambda^*} \left[ \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right], $$

where we have subtracted the factor $1/\omega^2$ to make the sum converge.
The term in brackets is now

$$ \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} = \frac{-z^2 - 2z\omega}{(z+\omega)^2 \omega^2} = O\left( \frac{1}{\omega^3} \right) \quad \text{as } |\omega| \to \infty, $$

and the new series will define a meromorphic function with the desired
poles once we have proved the following lemma.

Lemma 1.5 The two series

$$ \sum_{(n,m) \ne (0,0)} \frac{1}{(|n| + |m|)^r} \quad \text{and} \quad \sum_{n + m\tau \in \Lambda^*} \frac{1}{|n + m\tau|^r} $$

converge if $r > 2$.


Recall that according to the Note at the end of Chapter 7, the question 
whether a double series converges absolutely is independent of the order 
of summation. In the present case, we shall first sum in m and then in n. 

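For the first series, the usual integral comparison can be applied.^2 For
each $n \ne 0$

$$ \begin{aligned}
\sum_{m \in \mathbb{Z}} \frac{1}{(|n| + |m|)^r}
&= \frac{1}{|n|^r} + 2 \sum_{m \ge 1} \frac{1}{(|n| + |m|)^r} \\
&= \frac{1}{|n|^r} + 2 \sum_{k \ge |n|+1} \frac{1}{k^r} \\
&\le \frac{1}{|n|^r} + 2 \int_{|n|}^{\infty} \frac{dx}{x^r} \\
&\le \frac{1}{|n|^r} + C \frac{1}{|n|^{r-1}}.
\end{aligned} $$

Therefore, $r > 2$ implies

$$ \begin{aligned}
\sum_{(n,m) \ne (0,0)} \frac{1}{(|n| + |m|)^r}
&= \sum_{|m| \ne 0} \frac{1}{|m|^r} + \sum_{|n| \ne 0} \sum_{m \in \mathbb{Z}} \frac{1}{(|n| + |m|)^r} \\
&\le \sum_{|m| \ne 0} \frac{1}{|m|^r} + \sum_{|n| \ne 0} \left( \frac{1}{|n|^r} + C \frac{1}{|n|^{r-1}} \right) \\
&< \infty.
\end{aligned} $$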

To prove that the second series also converges, it suffices to show that
there is a constant $c$ such that

$$ |n| + |m| \le c\,|n + \tau m| \quad \text{for all } n, m \in \mathbb{Z}. $$


2 We simply use $1/k^r \le 1/x^r$ when $k - 1 \le x \le k$; see also the first
figure in Chapter 8, Book I.

We use the notation $x \lesssim y$ if there exists a positive constant $a$
such that $x \le ay$. We also write $x \approx y$ if both $x \lesssim y$ and
$y \lesssim x$ hold. Note that for any two positive numbers $A$ and $B$, one has

$$ (A^2 + B^2)^{1/2} \approx A + B. $$

On the one hand $A \le (A^2 + B^2)^{1/2}$ and $B \le (A^2 + B^2)^{1/2}$, so that
$A + B \le 2(A^2 + B^2)^{1/2}$. On the other hand, it suffices to square both
sides to see that $(A^2 + B^2)^{1/2} \le A + B$.

The proof that the second series in Lemma 1.5 converges is now a
consequence of the following observation:

$$ |n| + |m| \approx |n + m\tau| \quad \text{whenever } \tau \in \mathbb{H}. $$

Indeed, if $\tau = s + it$ with $s, t \in \mathbb{R}$ and $t > 0$, then

$$ |n + m\tau| = [(n + ms)^2 + (mt)^2]^{1/2} \approx |n + ms| + |mt| \approx |n + ms| + |m|, $$

by the previous observation. Then, $|n + ms| + |m| \approx |n| + |m|$, by
considering separately the cases when $|n| \le 2|m||s|$ and
$|n| \ge 2|m||s|$.

Remark. The proof above shows that when $r > 2$ the series
$\sum |n + m\tau|^{-r}$ converges uniformly in every half-plane
$\mathrm{Im}(\tau) \ge \delta > 0$.

In contrast, when $r = 2$ this series fails to converge (Exercise 3).
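
The borderline case can be made concrete with a small experiment. The sketch
below assumes the square lattice $\tau = i$, so that $|n + m\tau|^2 = n^2 + m^2$,
and sums over discs of radius $R$ as in Exercise 3; the function name
`lattice_sum` is hypothetical. The partial sums should keep growing, while
the difference from $2\pi \log R$ stays bounded.

```python
import math

def lattice_sum(R: int) -> float:
    """Sum of 1/(n^2 + m^2) over integer pairs with 1 <= n^2 + m^2 <= R^2."""
    total = 0.0
    for n in range(-R, R + 1):
        for m in range(-R, R + 1):
            q = n * n + m * m
            if 1 <= q <= R * R:
                total += 1.0 / q
    return total

for R in (10, 20, 40, 80):
    s = lattice_sum(R)
    # Second column grows without bound; third column should stay bounded.
    print(R, s, s - 2 * math.pi * math.log(R))
```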


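With this technical point behind us, we may now return to the definition
of the Weierstrass $\wp$ function, which is given by the series

$$ \begin{aligned}
\wp(z) &= \frac{1}{z^2} + \sum_{\omega \in \Lambda^*} \left[ \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right] \\
&= \frac{1}{z^2} + \sum_{(n,m) \ne (0,0)} \left[ \frac{1}{(z+n+m\tau)^2} - \frac{1}{(n+m\tau)^2} \right].
\end{aligned} $$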

We claim that $\wp$ is a meromorphic function with double poles at the
lattice points. To see this, suppose that $|z| < R$, and write

$$ \wp(z) = \frac{1}{z^2} + \sum_{|\omega| \le 2R} \left[ \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right] + \sum_{|\omega| > 2R} \left[ \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right]. $$

The term in the second sum is $O(1/|\omega|^3)$ uniformly for $|z| < R$, so by
Lemma 1.5 this second sum defines a holomorphic function in $|z| < R$.
Finally, note that the first sum exhibits double poles at the lattice points
in the disc $|z| < R$.

Observe that because of the insertion of the terms $-1/\omega^2$, it is no
longer obvious whether $\wp$ is doubly periodic. Nevertheless this is true,
and $\wp$ has all the properties of an elliptic function of order 2. We gather
this result in a theorem.

Theorem 1.6 The function $\wp$ is an elliptic function that has periods 1
and $\tau$, and double poles at the lattice points.

Proof. It remains only to prove that $\wp$ is periodic with the correct
periods. To do so, note that the derivative is given by differentiating the
series for $\wp$ termwise, so

$$ \wp'(z) = -2 \sum_{n,m \in \mathbb{Z}} \frac{1}{(z + n + m\tau)^3}. $$

This accomplishes two things for us. First, the differentiated series
converges absolutely whenever $z$ is not a lattice point, by the case $r = 3$
of Lemma 1.5. Second, the differentiation also eliminates the subtraction
term $1/\omega^2$; therefore the series for $\wp'$ is clearly periodic with
periods 1 and $\tau$, since it remains unchanged after replacing $z$ by
$z + 1$ or $z + \tau$.

Hence, there are two constants $a$ and $b$ such that

$$ \wp(z + 1) = \wp(z) + a \quad \text{and} \quad \wp(z + \tau) = \wp(z) + b. $$

It is clear from the definition, however, that $\wp$ is even, that is,
$\wp(z) = \wp(-z)$, since the sum over $\omega \in \Lambda$ can be replaced by
the sum over $-\omega \in \Lambda$. Therefore $\wp(-1/2) = \wp(1/2)$ and
$\wp(-\tau/2) = \wp(\tau/2)$, and setting $z = -1/2$ and $z = -\tau/2$,
respectively, in the two expressions above proves that $a = b = 0$.

A direct proof of the periodicity of $\wp$ can be given without
differentiation; see Exercise 4.
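
Evenness and double periodicity can also be observed numerically. The
following is a rough sketch that truncates the defining series to the
square $|n|, |m| \le N$; the function name `wp`, the cutoff $N$, and the
sample values of $\tau$ and $z$ are all assumptions made for illustration.
Evenness should hold up to rounding, while the two periodicity defects are
only small, since the truncated sum is not exactly periodic.

```python
def wp(z: complex, tau: complex, N: int = 50) -> complex:
    """Truncation of the series for the Weierstrass function to |n|, |m| <= N."""
    total = 1 / z**2
    for n in range(-N, N + 1):
        for m in range(-N, N + 1):
            if n == 0 and m == 0:
                continue
            w = n + m * tau
            # Term in brackets: 1/(z + w)^2 - 1/w^2
            total += 1 / (z + w)**2 - 1 / w**2
    return total

tau = 0.5 + 1.2j   # sample point in the upper half-plane
z = 0.3 + 0.2j     # sample point away from the lattice
print(abs(wp(z, tau) - wp(-z, tau)))        # evenness defect
print(abs(wp(z + 1, tau) - wp(z, tau)))     # defect for the period 1
print(abs(wp(z + tau, tau) - wp(z, tau)))   # defect for the period tau
```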

Properties of $\wp$

Several remarks are in order. First, we have already observed that $\wp$ is
even, and therefore $\wp'$ is odd. Since $\wp'$ is also periodic with periods
1 and $\tau$, we find that

$$ \wp'(1/2) = \wp'(\tau/2) = \wp'\left( \frac{1+\tau}{2} \right) = 0. $$

Indeed, one has, for example,

$$ \wp'(1/2) = -\wp'(-1/2) = -\wp'(-1/2 + 1) = -\wp'(1/2). $$

Since $\wp'$ is elliptic and has order 3, the three points $1/2$, $\tau/2$, and
$(1 + \tau)/2$ (which are called the half-periods) are the only roots of
$\wp'$ in the fundamental parallelogram, and they have multiplicity 1.
Therefore, if we define

$$ \wp(1/2) = e_1, \quad \wp(\tau/2) = e_2, \quad \text{and} \quad \wp\left( \frac{1+\tau}{2} \right) = e_3, $$

we conclude that the equation $\wp(z) = e_1$ has a double root at $1/2$. Since
$\wp$ has order 2, there are no other solutions to the equation
$\wp(z) = e_1$ in the fundamental parallelogram. Similarly the equations
$\wp(z) = e_2$ and $\wp(z) = e_3$ have only double roots at $\tau/2$ and
$(1 + \tau)/2$, respectively. In particular, the three numbers $e_1$, $e_2$,
and $e_3$ are distinct, for otherwise $\wp$ would take a single value at
least four times (counted with multiplicity) in the fundamental
parallelogram, contradicting the fact that $\wp$ has order 2. From these
observations we can prove the following theorem.

Theorem 1.7 The function $(\wp')^2$ is the cubic polynomial in $\wp$

$$ (\wp')^2 = 4(\wp - e_1)(\wp - e_2)(\wp - e_3). $$

Proof. The only roots of $F(z) = (\wp(z) - e_1)(\wp(z) - e_2)(\wp(z) - e_3)$ in
the fundamental parallelogram have multiplicity 2 and are at the points
$1/2$, $\tau/2$, and $(1 + \tau)/2$. Also, $(\wp')^2$ has double roots at these
points. Moreover, $F$ has poles of order 6 at the lattice points, and so does
$(\wp')^2$ (because $\wp'$ has poles of order 3 there). Consequently
$(\wp')^2/F$ is holomorphic and still doubly-periodic, hence this quotient is
constant. To find the value of this constant we note that for $z$ near 0,
one has

$$ \wp(z) = \frac{1}{z^2} + \cdots \quad \text{and} \quad \wp'(z) = \frac{-2}{z^3} + \cdots, $$

where the dots indicate terms of higher order. Therefore the constant
is 4, and the theorem is proved.

We next demonstrate the universality of $\wp$ by showing that every
elliptic function is a simple combination of $\wp$ and $\wp'$.

Theorem 1.8 Every elliptic function $f$ with periods 1 and $\tau$ is a
rational function of $\wp$ and $\wp'$.

The theorem will be an easy consequence of the following version of it.


Lemma 1.9 Every even elliptic function $F$ with periods 1 and $\tau$ is a
rational function of $\wp$.

Proof. If $F$ has a zero or pole at the origin it must be of even order,
since $F$ is an even function. As a consequence, there exists an integer $m$
so that $F\wp^m$ has no zero or pole at the lattice points. We may therefore
assume that $F$ itself has no zero or pole on $\Lambda$.

Our immediate goal is to use $\wp$ to construct a doubly-periodic function
$G$ with precisely the same zeros and poles as $F$. To achieve this, we recall
that $\wp(z) - \wp(a)$ has a single zero of order 2 if $a$ is a half-period,
and two distinct zeros at $a$ and $-a$ otherwise. We must therefore carefully
count the zeros and poles of $F$.

If $a$ is a zero of $F$, then so is $-a$, since $F$ is even. Moreover, $a$ is
congruent to $-a$ if and only if it is a half-period, in which case the zero
is of even order. Therefore, if the points $a_1, -a_1, \ldots, a_m, -a_m$
counted with multiplicities^3 describe all the zeros of $F$, then

$$ [\wp(z) - \wp(a_1)] \cdots [\wp(z) - \wp(a_m)] $$

has precisely the same roots as $F$. A similar argument, where
$b_1, -b_1, \ldots, b_m, -b_m$ (with multiplicities) describe all the poles of
$F$, then shows that

$$ G(z) = \frac{[\wp(z) - \wp(a_1)] \cdots [\wp(z) - \wp(a_m)]}{[\wp(z) - \wp(b_1)] \cdots [\wp(z) - \wp(b_m)]} $$

is periodic and has the same zeros and poles as $F$. Therefore, $F/G$ is
holomorphic and doubly-periodic, hence constant. This concludes the
proof of the lemma.

To prove the theorem, we first recall that $\wp$ is even while $\wp'$ is odd.
We then write $f$ as a sum of an even and an odd function,

$$ f(z) = f_{\mathrm{even}}(z) + f_{\mathrm{odd}}(z), $$

where in fact

$$ f_{\mathrm{even}}(z) = \frac{f(z) + f(-z)}{2} \quad \text{and} \quad f_{\mathrm{odd}}(z) = \frac{f(z) - f(-z)}{2}. $$

Then, since $f_{\mathrm{odd}}/\wp'$ is even, it is clear from the lemma applied
to $f_{\mathrm{even}}$ and $f_{\mathrm{odd}}/\wp'$ that $f$ is a rational function
of $\wp$ and $\wp'$.


3 If $a_j$ is not a half-period, then $a_j$ and $-a_j$ have the multiplicity of
$F$ at these points. If $a_j$ is a half-period, then $a_j$ and $-a_j$ are
congruent and each has multiplicity half of the multiplicity of $F$ at this
point.

2 The modular character of elliptic functions and Eisenstein series

We shall now study the modular character of elliptic functions, that is,
their dependence on $\tau$.

Recall the normalization we made at the beginning of the chapter. We
started with two periods $\omega_1$ and $\omega_2$ that are linearly independent
over $\mathbb{R}$, and we defined $\tau = \omega_2/\omega_1$. We could then assume
that $\mathrm{Im}(\tau) > 0$, and also that the two periods are 1 and $\tau$.
Next, we considered the lattice generated by 1 and $\tau$ and constructed the
function $\wp$, which is elliptic of order 2 with periods 1 and $\tau$. Since
the construction of $\wp$ depends on $\tau$, we could write $\wp_\tau$ instead.
This leads us to change our point of view and think of $\wp_\tau(z)$
primarily as a function of $\tau$. This approach yields many interesting new
insights.

Our considerations are guided by the following observations. First,
since 1 and $\tau$ generate the periods of $\wp_\tau(z)$, and 1 and $\tau + 1$
generate the same periods, we can expect a close relationship between
$\wp_\tau(z)$ and $\wp_{\tau+1}(z)$. In fact, it is easy to see that they are
identical. Second, since $\tau = \omega_2/\omega_1$ by the normalization imposed
at the beginning of Section 1, we see that $-1/\tau = -\omega_1/\omega_2$ (with
$\mathrm{Im}(-1/\tau) > 0$). This corresponds essentially to an interchange of
the two periods $\omega_1$ and $\omega_2$, and thus we can also expect an
intimate connection between $\wp_\tau$ and $\wp_{-1/\tau}$. In fact, it is easy
to verify that $\wp_{-1/\tau}(z) = \tau^2 \wp_\tau(\tau z)$.

So we are led to consider the group of transformations of the upper
half-plane $\mathrm{Im}(\tau) > 0$, generated by the two transformations
$\tau \mapsto \tau + 1$ and $\tau \mapsto -1/\tau$. This group is called the
modular group. On the basis of what we said, it can be expected that all
quantities intrinsically attached to $\wp_\tau(z)$ reflect the above
transformations. We see this clearly when we consider the Eisenstein series.


2.1 Eisenstein series 

The Eisenstein series of order $k$ is defined by

$$ E_k(\tau) = \sum_{(n,m) \ne (0,0)} \frac{1}{(n + m\tau)^k}, $$

whenever $k$ is an integer $\ge 3$ and $\tau$ is a complex number with
$\mathrm{Im}(\tau) > 0$. If $\Lambda$ is the lattice generated by 1 and $\tau$,
and if we write $\omega = n + m\tau$, then another expression for the
Eisenstein series is $\sum_{\omega \in \Lambda^*} 1/\omega^k$.

Theorem 2.1 Eisenstein series have the following properties:

(i) The series $E_k(\tau)$ converges if $k \ge 3$, and is holomorphic in the
upper half-plane.

(ii) $E_k(\tau) = 0$ if $k$ is odd.

(iii) $E_k(\tau)$ satisfies the following transformation relations:

$$ E_k(\tau + 1) = E_k(\tau) \quad \text{and} \quad E_k(\tau) = \tau^{-k} E_k(-1/\tau). $$

The last property is sometimes referred to as the modular character
of the Eisenstein series. We shall return to these and other modular
identities in the next chapter.

Proof. By Lemma 1.5 and the remark after it, the series $E_k(\tau)$
converges absolutely and uniformly in every half-plane
$\mathrm{Im}(\tau) \ge \delta > 0$, whenever $k \ge 3$; hence $E_k(\tau)$ is
holomorphic in the upper half-plane $\mathrm{Im}(\tau) > 0$.

By symmetry, replacing $n$ and $m$ by $-n$ and $-m$, we see that whenever
$k$ is odd the Eisenstein series is identically zero.

Finally, the fact that $E_k(\tau)$ is periodic of period 1 is clear from the
fact that $n + m(\tau + 1) = n + m + m\tau$, and that we can rearrange the sum
by replacing $n + m$ by $n$. Also, we have

$$ (n + m(-1/\tau))^k = \tau^{-k} (n\tau - m)^k, $$

and again we can rearrange the sum, this time replacing $(-m, n)$ by
$(n, m)$. Conclusion (iii) then follows.

Remark. Because of the second property, some authors define the
Eisenstein series of order $k$ to be
$\sum_{(n,m) \ne (0,0)} 1/(n + m\tau)^{2k}$, possibly also with a constant
factor in front.

The connection of the $E_k$ with the Weierstrass $\wp$ function arises when
we investigate the series expansion of $\wp$ near 0.

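Theorem 2.2 For $z$ near 0, we have

$$ \begin{aligned}
\wp(z) &= \frac{1}{z^2} + 3E_4 z^2 + 5E_6 z^4 + \cdots \\
&= \frac{1}{z^2} + \sum_{k=1}^{\infty} (2k+1) E_{2k+2} z^{2k}.
\end{aligned} $$

Proof. From the definition of $\wp$, if we note that we may replace $\omega$
by $-\omega$ without changing the sum, we have

$$ \wp(z) = \frac{1}{z^2} + \sum_{\omega \in \Lambda^*} \left[ \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right] = \frac{1}{z^2} + \sum_{\omega \in \Lambda^*} \left[ \frac{1}{(z-\omega)^2} - \frac{1}{\omega^2} \right], $$

where $\omega = n + m\tau$. The identity

$$ \frac{1}{(1 - w)^2} = \sum_{\ell=0}^{\infty} (\ell + 1) w^{\ell}, \quad \text{for } |w| < 1, $$

which follows from differentiating the geometric series, implies that for
all small $z$

$$ \frac{1}{(z - \omega)^2} = \frac{1}{\omega^2} \sum_{\ell=0}^{\infty} (\ell+1) \left( \frac{z}{\omega} \right)^{\ell} = \frac{1}{\omega^2} + \frac{1}{\omega^2} \sum_{\ell=1}^{\infty} (\ell+1) \left( \frac{z}{\omega} \right)^{\ell}. $$

Therefore

$$ \begin{aligned}
\wp(z) &= \frac{1}{z^2} + \sum_{\omega \in \Lambda^*} \sum_{\ell=1}^{\infty} (\ell+1) \frac{z^{\ell}}{\omega^{\ell+2}} \\
&= \frac{1}{z^2} + \sum_{\ell=1}^{\infty} (\ell+1) \left( \sum_{\omega \in \Lambda^*} \frac{1}{\omega^{\ell+2}} \right) z^{\ell} \\
&= \frac{1}{z^2} + \sum_{\ell=1}^{\infty} (\ell+1) E_{\ell+2} z^{\ell} \\
&= \frac{1}{z^2} + \sum_{k=1}^{\infty} (2k+1) E_{2k+2} z^{2k},
\end{aligned} $$

where we have used the fact that $E_{\ell+2} = 0$ whenever $\ell$ is odd.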

From this theorem, we obtain the following three expansions for $z$
near 0:

$$ \begin{aligned}
\wp'(z) &= \frac{-2}{z^3} + 6E_4 z + 20E_6 z^3 + \cdots, \\
(\wp'(z))^2 &= \frac{4}{z^6} - \frac{24E_4}{z^2} - 80E_6 + \cdots, \\
(\wp(z))^3 &= \frac{1}{z^6} + \frac{9E_4}{z^2} + 15E_6 + \cdots.
\end{aligned} $$

From these, one sees that the difference
$(\wp'(z))^2 - 4(\wp(z))^3 + 60E_4\wp(z) + 140E_6$ is holomorphic near 0, and in
fact equal to 0 at the origin. Since this difference is also doubly periodic,
we conclude by Theorem 1.2 that it is constant, and hence identically 0.
This proves the following corollary.

Corollary 2.3 If $g_2 = 60E_4$ and $g_3 = 140E_6$, then

$$ (\wp')^2 = 4\wp^3 - g_2\wp - g_3. $$









Chapter 9. AN INTRODUCTION TO ELLIPTIC FUNCTIONS 


Note that this identity is another version of Theorem 1.7, and it
allows one to express the symmetric functions of the $e_j$'s in terms of the
Eisenstein series.
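
The identity of Corollary 2.3 lends itself to a numerical spot check. The
sketch below truncates every lattice sum to the square $|n|, |m| \le N$, so
the two sides agree only approximately; the helper names, the cutoff, and
the sample values of $\tau$ and $z$ are assumptions made for illustration.

```python
def lattice(tau: complex, N: int):
    """Nonzero lattice points n + m*tau with |n|, |m| <= N."""
    return [n + m * tau
            for n in range(-N, N + 1)
            for m in range(-N, N + 1)
            if (n, m) != (0, 0)]

def wp(z: complex, pts) -> complex:
    """Truncated series for the Weierstrass function."""
    return 1 / z**2 + sum(1 / (z + w)**2 - 1 / w**2 for w in pts)

def wp_prime(z: complex, pts) -> complex:
    """Truncated termwise derivative, including the (0, 0) term -2/z^3."""
    return -2 / z**3 - 2 * sum(1 / (z + w)**3 for w in pts)

def eisenstein(k: int, pts) -> complex:
    """Truncated Eisenstein series of order k."""
    return sum(1 / w**k for w in pts)

tau, z = 0.5 + 1.2j, 0.3 + 0.2j   # sample values
pts = lattice(tau, 60)
g2 = 60 * eisenstein(4, pts)
g3 = 140 * eisenstein(6, pts)
lhs = wp_prime(z, pts)**2
rhs = 4 * wp(z, pts)**3 - g2 * wp(z, pts) - g3
print(abs(lhs - rhs) / abs(lhs))  # relative defect, expected to be small
```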


2.2 Eisenstein series and divisor functions 


We will describe now the link between Eisenstein series and some
number-theoretic quantities. This relation comes about if we consider the
Fourier coefficients in the Fourier expansion of the periodic function
$E_k(\tau)$. Equivalently, we can write $\mathcal{E}(z) = E_k(\tau)$ with
$z = e^{2\pi i\tau}$, and investigate the Laurent expansion of $\mathcal{E}$ as a
function of $z$.


We begin with a lemma. 

Lemma 2.4 If $k \ge 2$ and $\mathrm{Im}(\tau) > 0$, then

$$ \sum_{n=-\infty}^{\infty} \frac{1}{(n + \tau)^k} = \frac{(-2\pi i)^k}{(k-1)!} \sum_{\ell=1}^{\infty} \ell^{k-1} e^{2\pi i \tau \ell}. $$

Proof. This identity follows from applying the Poisson summation
formula to $f(z) = 1/(z + \tau)^k$; see Exercise 7 in Chapter 4.

An alternate proof consists of noting that it first suffices to establish
the formula for $k = 2$, since the other cases are then obtained by
differentiating term by term. To prove this special case, we differentiate
the formula for the cotangent derived in Chapter 5

$$ \sum_{n=-\infty}^{\infty} \frac{1}{n + \tau} = \pi \cot \pi\tau. $$

This yields

$$ \sum_{n=-\infty}^{\infty} \frac{1}{(n + \tau)^2} = \frac{\pi^2}{\sin^2(\pi\tau)}. $$

Now use Euler’s formula for the sine and the fact that

$$ \sum_{r=1}^{\infty} r w^r = \frac{w}{(1 - w)^2} \quad \text{with } w = e^{2\pi i\tau} $$

to obtain the desired result.
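
A quick numerical comparison of the two sides of Lemma 2.4 is sketched
below. Both sums are truncated, and the names `lhs`, `rhs`, the cutoffs, and
the sample $\tau$ are assumptions for illustration only. The case $k = 2$ is
omitted because the truncated left-hand side converges slowly there.

```python
import cmath
import math

def lhs(tau: complex, k: int, N: int = 2000) -> complex:
    """Truncation of sum over n of 1/(n + tau)^k to |n| <= N."""
    return sum(1 / (n + tau)**k for n in range(-N, N + 1))

def rhs(tau: complex, k: int, L: int = 60) -> complex:
    """(-2 pi i)^k / (k-1)! times sum_{l=1}^{L} l^(k-1) e^{2 pi i tau l}."""
    q = cmath.exp(2j * math.pi * tau)
    series = sum(l**(k - 1) * q**l for l in range(1, L + 1))
    return (-2j * math.pi)**k / math.factorial(k - 1) * series

tau = 0.3 + 0.7j  # sample point in the upper half-plane
for k in (3, 4, 6):
    print(k, abs(lhs(tau, k) - rhs(tau, k)))
```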

As a consequence of this lemma, we can draw a connection between
the Eisenstein series, the zeta function, and the divisor functions. The
divisor function $\sigma_\ell(r)$ that arises here is defined as the sum of the
$\ell$-th powers of the divisors of $r$, that is,

$$ \sigma_\ell(r) = \sum_{d | r} d^{\ell}. $$

Theorem 2.5 If $k \ge 4$ is even, and $\mathrm{Im}(\tau) > 0$, then

$$ E_k(\tau) = 2\zeta(k) + \frac{2(-1)^{k/2}(2\pi)^k}{(k-1)!} \sum_{r=1}^{\infty} \sigma_{k-1}(r) e^{2\pi i \tau r}. $$


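Proof. First observe that $\sigma_{k-1}(r) \le r \cdot r^{k-1} = r^k$. If
$\mathrm{Im}(\tau) = t$, then whenever $t \ge t_0$ we have
$|e^{2\pi i r\tau}| \le e^{-2\pi r t_0}$, and we see that the series in the
theorem is absolutely convergent in any half-plane $t \ge t_0$, by comparison
with $\sum_{r=1}^{\infty} r^k e^{-2\pi r t_0}$. To establish the formula, we use
the definition of $E_k$, that of $\zeta$, the fact that $k$ is even, and the
previous lemma (with $\tau$ replaced by $m\tau$) to get successively

$$ \begin{aligned}
E_k(\tau) &= \sum_{(n,m) \ne (0,0)} \frac{1}{(n + m\tau)^k} \\
&= \sum_{n \ne 0} \frac{1}{n^k} + \sum_{m \ne 0} \sum_{n=-\infty}^{\infty} \frac{1}{(n + m\tau)^k} \\
&= 2\zeta(k) + \sum_{m \ne 0} \sum_{n=-\infty}^{\infty} \frac{1}{(n + m\tau)^k} \\
&= 2\zeta(k) + 2 \sum_{m > 0} \sum_{n=-\infty}^{\infty} \frac{1}{(n + m\tau)^k} \\
&= 2\zeta(k) + 2 \sum_{m > 0} \frac{(-2\pi i)^k}{(k-1)!} \sum_{\ell=1}^{\infty} \ell^{k-1} e^{2\pi i m \tau \ell} \\
&= 2\zeta(k) + \frac{2(-1)^{k/2}(2\pi)^k}{(k-1)!} \sum_{m > 0} \sum_{\ell=1}^{\infty} \ell^{k-1} e^{2\pi i \tau m \ell} \\
&= 2\zeta(k) + \frac{2(-1)^{k/2}(2\pi)^k}{(k-1)!} \sum_{r=1}^{\infty} \sigma_{k-1}(r) e^{2\pi i \tau r}.
\end{aligned} $$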

This proves the desired formula. 
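
The formula can be compared numerically with a direct (truncated) lattice
sum. This is only a sketch: the function names, the cutoffs, the sample
$\tau$, and the use of the closed forms $\zeta(4) = \pi^4/90$ and
$\zeta(6) = \pi^6/945$ are assumptions of the example, not part of the text.

```python
import cmath
import math

ZETA = {4: math.pi**4 / 90, 6: math.pi**6 / 945}  # known closed forms

def eisenstein_lattice(k: int, tau: complex, N: int = 80) -> complex:
    """Truncation of E_k(tau) to the square |n|, |m| <= N."""
    total = 0
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if (n, m) != (0, 0):
                total += 1 / (n + m * tau)**k
    return total

def sigma(p: int, r: int) -> int:
    """Sum of the p-th powers of the divisors of r."""
    return sum(d**p for d in range(1, r + 1) if r % d == 0)

def eisenstein_fourier(k: int, tau: complex, terms: int = 40) -> complex:
    """Right-hand side of Theorem 2.5, truncated after `terms` terms."""
    q = cmath.exp(2j * math.pi * tau)
    coeff = 2 * (-1)**(k // 2) * (2 * math.pi)**k / math.factorial(k - 1)
    tail = sum(sigma(k - 1, r) * q**r for r in range(1, terms + 1))
    return 2 * ZETA[k] + coeff * tail

tau = 0.3 + 1.1j  # sample point in the upper half-plane
for k in (4, 6):
    print(k, abs(eisenstein_lattice(k, tau) - eisenstein_fourier(k, tau)))
```

The agreement for $k = 4$ is limited by the slow decay of the truncated
lattice sum, not by the Fourier series, which converges very quickly.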

Finally, we turn to the forbidden case $k = 2$. The series we have in
mind, $\sum_{(n,m) \ne (0,0)} 1/(n + m\tau)^2$, no longer converges absolutely,
but we seek to give it a meaning anyway. We define

$$ F(\tau) = \sum_{m} \left( \sum_{n} \frac{1}{(n + m\tau)^2} \right) $$

summed in the indicated order with $(n, m) \ne (0, 0)$. The argument given
in the above theorem proves that the double sum converges, and in fact
has the expected expression.


Corollary 2.6 The double sum defining $F$ converges in the indicated
order. We have

$$ F(\tau) = 2\zeta(2) - 8\pi^2 \sum_{r=1}^{\infty} \sigma(r) e^{2\pi i r \tau}, $$

where $\sigma(r) = \sum_{d | r} d$ is the sum of the divisors of $r$.

It can be seen that $F(-1/\tau)\tau^{-2}$ does not equal $F(\tau)$, and this is
the same as saying that the double series for $F$ gives a different value
($\tilde{F}$, the reverse of $F$) when we sum first in $m$ and then in $n$. It
turns out that nevertheless the forbidden Eisenstein series $F(\tau)$ can be
used in a crucial way in the proof of the celebrated theorem about
representing an integer as the sum of four squares. We turn to these
matters in the next chapter.

3 Exercises 

1. Suppose that a meromorphic function $f$ has two periods $\omega_1$ and
$\omega_2$, with $\omega_2/\omega_1 \in \mathbb{R}$.

(a) Suppose $\omega_2/\omega_1$ is rational, say equal to $p/q$, where $p$ and
$q$ are relatively prime integers. Prove that as a result the periodicity
assumption is equivalent to the assumption that $f$ is periodic with the
simple period $\omega_0 = \frac{1}{q}\omega_1$.
[Hint: Since $p$ and $q$ are relatively prime, there exist integers $m$ and
$n$ such that $mq + np = 1$ (Corollary 1.3, Chapter 8, Book I).]

(b) If $\omega_2/\omega_1$ is irrational, then $f$ is constant. To prove this,
use the fact that $\{m - n\tau\}$ is dense in $\mathbb{R}$ whenever $\tau$ is
irrational and $m, n$ range over the integers.


2. Suppose that $a_1, \ldots, a_r$ and $b_1, \ldots, b_r$ are the zeros and
poles, respectively, in the fundamental parallelogram of an elliptic
function $f$. Show that

$$ a_1 + \cdots + a_r - b_1 - \cdots - b_r = n\omega_1 + m\omega_2 $$

for some integers $n$ and $m$.

[Hint: If the boundary of the parallelogram contains no zeros or poles,
simply integrate $z f'(z)/f(z)$ over that boundary, and observe that the
integral of $f'(z)/f(z)$ over a side is an integer multiple of $2\pi i$. If
there are zeros or poles on the side of the parallelogram, translate it by a
small amount to reduce the problem to the first case.]

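3. In contrast with the result in Lemma 1.5, prove that the series

$$ \sum_{n + m\tau \in \Lambda^*} \frac{1}{|n + m\tau|^2} \quad \text{where } \tau \in \mathbb{H} $$

does not converge. In fact, show that

$$ \sum_{1 \le n^2 + m^2 \le R^2} \frac{1}{n^2 + m^2} = 2\pi \log R + O(1) \quad \text{as } R \to \infty. $$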


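4. By rearranging the series

$$ \frac{1}{z^2} + \sum_{\omega \in \Lambda^*} \left[ \frac{1}{(z+\omega)^2} - \frac{1}{\omega^2} \right], $$

show directly, without differentiation, that $\wp(z + \omega) = \wp(z)$
whenever $\omega \in \Lambda$.

[Hint: For $R$ sufficiently large, note that $\wp(z) = \wp^R(z) + O(1/R)$, where
$\wp^R(z) = z^{-2} + \sum_{0 < |\omega| < R} ((z + \omega)^{-2} - \omega^{-2})$. Next,
observe that both $\wp^R(z + 1) - \wp^R(z)$ and $\wp^R(z + \tau) - \wp^R(z)$ are
$O\left( \sum_{R - c < |\omega| < R + c} |\omega|^{-2} \right) = O(1/R)$.]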

5. Let $\sigma(z)$ be the canonical product

$$ \sigma(z) = z \prod_{j=1}^{\infty} E_2(z/\tau_j), $$

where $\tau_j$ is an enumeration of the periods $\{n + m\tau\}$ with
$(n, m) \ne (0, 0)$, and $E_2(z) = (1 - z)e^{z + z^2/2}$.

(a) Show that $\sigma(z)$ is an entire function of order 2 that has simple
zeros at all the periods $n + m\tau$, and vanishes nowhere else.

(b) Show that

$$ \frac{\sigma'(z)}{\sigma(z)} = \frac{1}{z} + \sum_{(n,m) \ne (0,0)} \left[ \frac{1}{z - n - m\tau} + \frac{1}{n + m\tau} + \frac{z}{(n + m\tau)^2} \right], $$

and that this series converges whenever $z$ is not a lattice point.














(c) Let $L(z) = -\sigma'(z)/\sigma(z)$. Then

$$ L'(z) = \frac{(\sigma'(z))^2 - \sigma(z)\sigma''(z)}{(\sigma(z))^2} = \wp(z). $$


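6. Prove that $\wp''$ is a quadratic polynomial in $\wp$.

7. Setting $\tau = 1/2$ in the expression

$$ \sum_{m=-\infty}^{\infty} \frac{1}{(m + \tau)^2} = \frac{\pi^2}{\sin^2(\pi\tau)}, $$

deduce that

$$ \sum_{m \ge 1,\ m \text{ odd}} \frac{1}{m^2} = \frac{\pi^2}{8} \quad \text{and} \quad \sum_{m \ge 1} \frac{1}{m^2} = \frac{\pi^2}{6} = \zeta(2). $$

Similarly, using $\sum 1/(m + \tau)^4$ deduce that

$$ \sum_{m \ge 1,\ m \text{ odd}} \frac{1}{m^4} = \frac{\pi^4}{96} \quad \text{and} \quad \sum_{m \ge 1} \frac{1}{m^4} = \frac{\pi^4}{90} = \zeta(4). $$

These results were already obtained using Fourier series in the exercises at
the end of Chapters 2 and 3 in Book I.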


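8. Let

$$ E_4(\tau) = \sum_{(n,m) \ne (0,0)} \frac{1}{(n + m\tau)^4} $$

be the Eisenstein series of order 4.

(a) Show that $E_4(\tau) \to \pi^4/45$ as $\mathrm{Im}(\tau) \to \infty$.

(b) More precisely,

$$ \left| E_4(\tau) - \frac{\pi^4}{45} \right| \le c e^{-2\pi t} \quad \text{if } \tau = x + it \text{ and } t \ge 1. $$

(c) Deduce that

$$ \left| E_4(\tau) - \tau^{-4} \frac{\pi^4}{45} \right| \le c\, t^{-4} e^{-2\pi/t} \quad \text{if } \tau = it \text{ and } 0 < t \le 1. $$

4 Problems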


1. Besides the approach in Section 1.2, there are several alternate ways of
dealing with the sum $\sum 1/(z + \omega)^2$, where $\omega = n + m\tau$. For
example, one may sum either (a) circularly, (b) first in $n$ then in $m$,
(c) or first in $m$ then in $n$.

(a) Prove that if $z \notin \Lambda$, then

$$ \lim_{R \to \infty} \sum_{n^2 + m^2 \le R^2} \frac{1}{(z + n + m\tau)^2} = S_1(z) $$

exists and $S_1(z) = \wp(z) + c_1$.

(b) Similarly,

$$ \sum_{m} \left( \sum_{n} \frac{1}{(z + n + m\tau)^2} \right) = S_2(z) $$

exists and $S_2(z) = \wp(z) + c_2$, where $c_2 = F(\tau)$, and $F$ is the
forbidden Eisenstein series.

(c) Also

$$ \sum_{n} \left( \sum_{m} \frac{1}{(z + n + m\tau)^2} \right) = S_3(z) $$

exists with $S_3(z) = \wp(z) + c_3$, and $c_3 = \tilde{F}(\tau)$, the reverse
of $F$.

[Hint: To prove (a), it suffices to show that
$\lim_{R \to \infty} \sum_{1 \le n^2 + m^2 \le R^2} 1/(n + m\tau)^2 = c_1$ exists.
This is proved by a comparison with
$\iint_{1 \le x^2 + y^2 \le R^2} \frac{dx\,dy}{(x + y\tau)^2} = I(R)$. It can be
shown that $I(R) = 0$, which follows because
$(x + y\tau)^{-2} = -(\partial/\partial x)(x + y\tau)^{-1}$.]


2. Show that

$$ \wp(z) = c + \pi^2 \sum_{m=-\infty}^{\infty} \frac{1}{\sin^2((z + m\tau)\pi)}, $$

where $c$ is an appropriate constant. In fact, by part (b) of the previous
problem $c = -F(\tau)$.


3.* Suppose $\Omega$ is a simply connected domain that excludes the three
roots of the polynomial $4z^3 - g_2 z - g_3$. For $\omega_0 \in \Omega$ and
$\omega_0$ fixed, define the function $I$ on $\Omega$ by

$$ I(\omega) = \int_{\omega_0}^{\omega} \frac{dz}{\sqrt{4z^3 - g_2 z - g_3}}, \quad \omega \in \Omega. $$

Then the function $I$ has an inverse given by $\wp(z + \alpha)$ for some
constant $\alpha$; that is,

$$ I(\wp(z + \alpha)) = z $$

for appropriate $\alpha$.

[Hint: Prove that $(I(\wp(z + \alpha)))' = \pm 1$, and use the fact that $\wp$
is even.]

4.* Suppose $\tau$ is purely imaginary, say $\tau = it$ with $t > 0$. Consider
the division of the complex plane into congruent rectangles obtained by
considering the lines $x = n/2$, $y = tm/2$ as $n$ and $m$ range over the
integers. (An example is the rectangle whose vertices are $0$, $1/2$,
$1/2 + \tau/2$, and $\tau/2$.)

(a) Show that $\wp$ is real-valued on all these lines, and hence on the
boundaries of all these rectangles.

(b) Prove that $\wp$ maps the interior of each rectangle conformally to the
upper (or lower) half-plane.


Applications of Theta 
Functions 


The problem of the representation of an integer n as 
the sum of a given number k of integral squares is one 
of the most celebrated in the theory of numbers. Its 
history may be traced back to Diophantus, but begins 
effectively with Girard’s (or Fermat’s) theorem that a 
prime 4m + 1 is the sum of two squares. Almost every 
arithmetician of note since Fermat has contributed to 
the solution of the problem, and it has its puzzles for 
us still. 

G. H. Hardy, 1940 


This chapter is devoted to a closer look at the theory of theta functions 
and some of its applications to combinatorics and number theory. 

The theta function is given by the series

$$ \Theta(z|\tau) = \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n z}, $$

which converges for all $z \in \mathbb{C}$, and $\tau$ in the upper half-plane.

A remarkable feature of the theta function is its dual nature. When
viewed as a function of $z$, we see it in the arena of elliptic functions,
since $\Theta$ is periodic with period 1 and “quasi-period” $\tau$. When
considered as a function of $\tau$, $\Theta$ reveals its modular nature and
close connection with the partition function and the problem of
representation of integers as sums of squares.

The two main tools allowing us to exploit these links are the
triple-product for $\Theta$ and its transformation law. Once we have proved
these theorems, we give a brief introduction to the connection with
partitions, and then pass to proofs of the celebrated theorems about
representation of integers as sums of two or four squares.

1 Product formula for the Jacobi theta function

In its most elaborate form, Jacobi’s theta function is defined for
$z \in \mathbb{C}$ and $\tau \in \mathbb{H}$ by

$$ \Theta(z|\tau) = \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n z}. \tag{1} $$

Two significant special cases (or variants) are $\theta(\tau)$ and
$\vartheta(t)$, which are defined by

$$ \theta(\tau) = \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau}, \quad \tau \in \mathbb{H}, $$

$$ \vartheta(t) = \sum_{n=-\infty}^{\infty} e^{-\pi n^2 t}, \quad t > 0. $$

In fact, the relation between these various functions is given by
$\theta(\tau) = \Theta(0|\tau)$ and $\vartheta(t) = \theta(it)$, with of course,
$t > 0$.

We have already encountered these functions several times. For example,
in the study of the heat diffusion equation for the circle, in Chapter 4
of Book I, we found that the heat kernel was given by

$$ H_t(x) = \sum_{n=-\infty}^{\infty} e^{-4\pi^2 n^2 t} e^{2\pi i n x}, $$

and therefore $H_t(x) = \Theta(x|4\pi i t)$.

Another instance was the occurrence of $\vartheta$ in the study of the zeta
function. In fact, we proved in Chapter 6 that the functional equation of
$\vartheta$ implied that of $\zeta$, which then led to the analytic
continuation of the zeta function.

We begin our closer look at $\Theta$ as a function of $z$, with $\tau$
fixed, by recording its basic structural properties, which to a large
extent characterize it.

Proposition 1.1 The function $\Theta$ satisfies the following properties:

(i) $\Theta$ is entire in $z \in \mathbb{C}$ and holomorphic in $\tau \in \mathbb{H}$.

(ii) $\Theta(z + 1|\tau) = \Theta(z|\tau)$.

(iii) $\Theta(z + \tau|\tau) = \Theta(z|\tau) e^{-\pi i \tau} e^{-2\pi i z}$.

(iv) $\Theta(z|\tau) = 0$ whenever $z = 1/2 + \tau/2 + n + m\tau$ and $n, m \in \mathbb{Z}$.

Proof. Suppose that $\operatorname{Im}(\tau) = t \ge t_0 > 0$ and
$z = x + iy$ belongs to a bounded set in $\mathbb{C}$, say $|z| \le M$.
Then, the series defining $\Theta$ is absolutely and uniformly convergent,
since

$$\sum_{n=-\infty}^{\infty} |e^{\pi i n^2 \tau} e^{2\pi i n z}| \le C \sum_{n \ge 0} e^{-\pi n^2 t_0} e^{2\pi n M} < \infty.$$

Therefore, for each fixed $\tau \in \mathbb{H}$ the function
$\Theta(\cdot|\tau)$ is entire, and for each fixed $z \in \mathbb{C}$ the
function $\Theta(z|\cdot)$ is holomorphic in the upper half-plane.

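Since the exponential $e^{2\pi i n z}$ is periodic of period 1, property
(ii) is immediate from the definition of $\Theta$.

To show the third property we may complete the squares in the expression
for $\Theta(z + \tau|\tau)$. In detail, we have

$$\begin{aligned}
\Theta(z + \tau|\tau) &= \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n (z + \tau)} \\
&= \sum_{n=-\infty}^{\infty} e^{\pi i (n^2 + 2n) \tau} e^{2\pi i n z} \\
&= \sum_{n=-\infty}^{\infty} e^{\pi i (n+1)^2 \tau} e^{-\pi i \tau} e^{2\pi i n z} \\
&= \sum_{n=-\infty}^{\infty} e^{\pi i (n+1)^2 \tau} e^{-\pi i \tau} e^{2\pi i (n+1) z} e^{-2\pi i z} \\
&= \Theta(z|\tau) e^{-\pi i \tau} e^{-2\pi i z}.
\end{aligned}$$

Thus we see that $\Theta(z|\tau)$, as a function of $z$, is periodic with
period 1 and "quasi-periodic" with period $\tau$.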

To establish the last property it suffices, by what was just shown, to
prove that $\Theta(1/2 + \tau/2|\tau) = 0$. Again, we use the interplay
between $n$ and $n^2$ to get

$$\begin{aligned}
\Theta(1/2 + \tau/2|\tau) &= \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{2\pi i n (1/2 + \tau/2)} \\
&= \sum_{n=-\infty}^{\infty} (-1)^n e^{\pi i (n^2 + n) \tau}.
\end{aligned}$$

To see that this last sum is identically zero, it suffices to match
$n \ge 0$ with $-n - 1$, and to observe that they have opposite parity,
and that $(-n - 1)^2 + (-n - 1) = n^2 + n$. This completes the proof of
the proposition.

We consider next a product $\Pi(z|\tau)$ that enjoys the same structural
properties as $\Theta(z|\tau)$ as a function of $z$. This product is
defined for $z \in \mathbb{C}$ and $\tau \in \mathbb{H}$ by

$$\Pi(z|\tau) = \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n-1} e^{2\pi i z})(1 + q^{2n-1} e^{-2\pi i z}),$$

where we have used the notation that is standard in the subject, namely
$q = e^{\pi i \tau}$. The function $\Pi(z|\tau)$ is sometimes referred to
as the triple-product.

Proposition 1.2 The function $\Pi(z|\tau)$ satisfies the following properties:

(i) $\Pi(z|\tau)$ is entire in $z \in \mathbb{C}$ and holomorphic for $\tau \in \mathbb{H}$.

(ii) $\Pi(z + 1|\tau) = \Pi(z|\tau)$.

(iii) $\Pi(z + \tau|\tau) = \Pi(z|\tau) e^{-\pi i \tau} e^{-2\pi i z}$.

(iv) $\Pi(z|\tau) = 0$ whenever $z = 1/2 + \tau/2 + n + m\tau$ and
$n, m \in \mathbb{Z}$. Moreover, these points are simple zeros of
$\Pi(\cdot|\tau)$, and $\Pi(\cdot|\tau)$ has no other zeros.

Proof. If $\operatorname{Im}(\tau) = t \ge t_0 > 0$ and $z = x + iy$, then
$|q| \le e^{-\pi t_0} < 1$ and

$$(1 - q^{2n})(1 + q^{2n-1} e^{2\pi i z})(1 + q^{2n-1} e^{-2\pi i z}) = 1 + O(|q|^{2n-1} e^{2\pi |z|}).$$

Since the series $\sum |q|^{2n-1}$ converges, the results for infinite
products in Chapter 5 guarantee that $\Pi(z|\tau)$ defines an entire
function of $z$ with $\tau \in \mathbb{H}$ fixed, and a holomorphic
function for $\tau \in \mathbb{H}$ with $z \in \mathbb{C}$ fixed.

Also, it is clear from the definition that $\Pi(z|\tau)$ is periodic of
period 1 in the $z$ variable.

To prove the third property, we first observe that since
$q^2 = e^{2\pi i \tau}$ we have

$$\begin{aligned}
\Pi(z + \tau|\tau) &= \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n-1} e^{2\pi i (z+\tau)})(1 + q^{2n-1} e^{-2\pi i (z+\tau)}) \\
&= \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n+1} e^{2\pi i z})(1 + q^{2n-3} e^{-2\pi i z}).
\end{aligned}$$

Comparing this last product with $\Pi(z|\tau)$, and isolating the factors
that are either missing or extra leads to

$$\Pi(z + \tau|\tau) = \Pi(z|\tau) \left( \frac{1 + q^{-1} e^{-2\pi i z}}{1 + q e^{2\pi i z}} \right).$$

Hence (iii) follows because $(1 + x)/(1 + x^{-1}) = x$, whenever $x \ne -1$.

Finally, to find the zeros of $\Pi(z|\tau)$ we recall that a product that
converges vanishes only if at least one of its factors is zero. Clearly,
the factor $(1 - q^{2n})$ never vanishes since $|q| < 1$. The second factor
$(1 + q^{2n-1} e^{2\pi i z})$ vanishes when
$q^{2n-1} e^{2\pi i z} = -1 = e^{\pi i}$. Since $q = e^{\pi i \tau}$, we
then have$^{1}$

$$(2n - 1)\tau + 2z \equiv 1 \pmod{2}.$$

Hence,

$$z \equiv 1/2 + \tau/2 - n\tau \pmod{1},$$

and this takes care of the zeros of the type $1/2 + \tau/2 - n\tau + m$
with $n \ge 1$ and $m \in \mathbb{Z}$. Similarly, the third factor
vanishes if

$$(2n - 1)\tau - 2z \equiv 1 \pmod{2},$$

which implies that

$$\begin{aligned}
z &\equiv -1/2 - \tau/2 + n\tau \pmod{1} \\
&\equiv 1/2 + \tau/2 + n'\tau \pmod{1},
\end{aligned}$$

where $n' \ge 0$. This exhausts the zeros of $\Pi(\cdot|\tau)$. Finally,
these zeros are simple, since the function $e^w - 1$ vanishes at the origin
to order 1 (a fact obvious from a power series expansion or a simple
differentiation).

The importance of the product $\Pi$ comes from the following theorem,
called the product formula for the theta function. The fact that
$\Theta(z|\tau)$ and $\Pi(z|\tau)$ satisfy similar properties hints at a
close connection between the two. This is indeed the case.

Theorem 1.3 (Product formula) For all $z \in \mathbb{C}$ and
$\tau \in \mathbb{H}$ we have the identity $\Theta(z|\tau) = \Pi(z|\tau)$.

Proof. Fix $\tau \in \mathbb{H}$. We claim first that there exists a
constant $c(\tau)$ such that

$$\Theta(z|\tau) = c(\tau) \Pi(z|\tau). \tag{2}$$

In fact, consider the quotient $F(z) = \Theta(z|\tau)/\Pi(z|\tau)$, and
note that by the previous two propositions, the function $F$ is entire and
doubly periodic with periods 1 and $\tau$. This implies that $F$ is
constant as claimed.


1 We use the standard short-hand, $a \equiv b \pmod{c}$, to mean that
$a - b$ is an integral multiple of $c$.





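We must now prove that $c(\tau) = 1$ for all $\tau$, and the main point is
to establish that $c(\tau) = c(4\tau)$. If we put $z = 1/2$ in (2), so that
$e^{2i\pi z} = e^{-2i\pi z} = -1$, we obtain

$$\begin{aligned}
\sum_{n=-\infty}^{\infty} (-1)^n q^{n^2} &= c(\tau) \prod_{n=1}^{\infty} (1 - q^{2n})(1 - q^{2n-1})(1 - q^{2n-1}) \\
&= c(\tau) \prod_{n=1}^{\infty} \left[ (1 - q^{2n-1})(1 - q^{2n}) \right] (1 - q^{2n-1}) \\
&= c(\tau) \prod_{n=1}^{\infty} (1 - q^{n})(1 - q^{2n-1}).
\end{aligned}$$

Hence

$$c(\tau) = \frac{\sum_{n=-\infty}^{\infty} (-1)^n q^{n^2}}{\prod_{n=1}^{\infty} (1 - q^{n})(1 - q^{2n-1})}. \tag{3}$$

Next, we put $z = 1/4$ in (2), so that $e^{2i\pi z} = i$. On the one hand,
we have

$$\Theta(1/4|\tau) = \sum_{n=-\infty}^{\infty} q^{n^2} i^n,$$

and due to the fact that $1/i = -i$, only the terms corresponding to
$n$ even, $n = 2m$, are not cancelled; thus

$$\Theta(1/4|\tau) = \sum_{m=-\infty}^{\infty} q^{4m^2} (-1)^m.$$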

On the other hand,

$$\begin{aligned}
\Pi(1/4|\tau) &= \prod_{m=1}^{\infty} (1 - q^{2m})(1 + i q^{2m-1})(1 - i q^{2m-1}) \\
&= \prod_{m=1}^{\infty} (1 - q^{2m})(1 + q^{4m-2}) \\
&= \prod_{n=1}^{\infty} (1 - q^{4n})(1 - q^{8n-4}),
\end{aligned}$$

where the last line is obtained by considering separately the two cases
$2m = 4n$ and $2m = 4n - 2$ in the first factor, and using
$(1 - q^{4n-2})(1 + q^{4n-2}) = 1 - q^{8n-4}$. Hence

$$c(\tau) = \frac{\sum_{n=-\infty}^{\infty} (-1)^n q^{4n^2}}{\prod_{n=1}^{\infty} (1 - q^{4n})(1 - q^{8n-4})}, \tag{4}$$

and combining (3) and (4) establishes our claim that $c(\tau) = c(4\tau)$.
Successive applications of this identity give $c(\tau) = c(4^k \tau)$, and
since $q^{4^k} = e^{i\pi 4^k \tau} \to 0$ as $k \to \infty$, we conclude
from (2) that $c(\tau) = 1$. This proves the theorem.
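The product formula lends itself to a quick numerical sanity check. The sketch below is only an illustration under stated assumptions: the function names `theta` and `triple_product`, the truncation levels, and the sample point are all hypothetical choices, and agreement to floating-point accuracy is of course not a proof.

```python
import cmath

def theta(z, tau, terms=40):
    """Truncated series: sum over |n| <= terms of e^{pi i n^2 tau} e^{2 pi i n z}."""
    return sum(
        cmath.exp(1j * cmath.pi * n * n * tau + 2j * cmath.pi * n * z)
        for n in range(-terms, terms + 1)
    )

def triple_product(z, tau, factors=60):
    """Truncated product of (1 - q^{2n})(1 + q^{2n-1} w)(1 + q^{2n-1} / w)."""
    q = cmath.exp(1j * cmath.pi * tau)
    w = cmath.exp(2j * cmath.pi * z)
    odd_power = q        # holds q^{2n-1}, starting at n = 1
    even_power = q * q   # holds q^{2n}
    result = 1 + 0j
    for _ in range(factors):
        result *= (1 - even_power) * (1 + odd_power * w) * (1 + odd_power / w)
        odd_power *= q * q
        even_power *= q * q
    return result

# Arbitrary sample point with Im(tau) > 0 (an assumption of this sketch).
tau = 0.3 + 0.8j
z = 0.2 - 0.1j

series_value = theta(z, tau)
product_value = triple_product(z, tau)

# Quasi-periodicity (iii) of Proposition 1.1 and the zero at 1/2 + tau/2.
shifted = theta(z + tau, tau)
expected_shifted = series_value * cmath.exp(-1j * cmath.pi * tau) * cmath.exp(-2j * cmath.pi * z)
zero_value = theta(0.5 + tau / 2, tau)
```

At this sample point the series and the product should agree up to rounding, and `zero_value` should be negligibly small.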


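The product formula for the function $\Theta$ specializes to its variant
$\theta(\tau) = \Theta(0|\tau)$, and this provides a proof that $\theta$ is
non-vanishing in the upper half-plane.

Corollary 1.4 If $\operatorname{Im}(\tau) > 0$ and $q = e^{\pi i \tau}$, then

$$\theta(\tau) = \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n-1})^2.$$

Thus $\theta(\tau) \ne 0$ for $\tau \in \mathbb{H}$.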

The next corollary shows that the properties of the function $\Theta$ now
yield the construction of an elliptic function (which is in fact closely
related to the Weierstrass $\wp$ function).

Corollary 1.5 For each fixed $\tau \in \mathbb{H}$, the quotient

$$(\log \Theta(z|\tau))'' = \frac{\Theta(z|\tau) \Theta''(z|\tau) - (\Theta'(z|\tau))^2}{\Theta(z|\tau)^2}$$

is an elliptic function of order 2 with periods 1 and $\tau$, and with a
double pole at $z = 1/2 + \tau/2$.

In the above, the primes $'$ denote differentiation with respect to the
$z$ variable.

Proof. Let $F(z) = (\log \Theta(z|\tau))' = \Theta(z|\tau)'/\Theta(z|\tau)$.
Differentiating the identities (ii) and (iii) of Proposition 1.1 gives
$F(z + 1) = F(z)$, $F(z + \tau) = F(z) - 2\pi i$, and differentiating again
shows that $F'(z)$ is doubly periodic. Since $\Theta(z|\tau)$ vanishes only
at $z = 1/2 + \tau/2$ in the fundamental parallelogram, the function $F(z)$
has only a single pole, and thus $F'(z)$ has only a double pole there.

The precise connection between $(\log \Theta(z|\tau))''$ and
$\wp_\tau(z)$ is stated in Exercise 1.

For an analogy between $\Theta$ and the Weierstrass $\sigma$ function, see
Exercise 5 of the previous chapter.


1.1 Further transformation laws 

We now come to the study of the transformation relations in the
$\tau$-variable, that is, to the modular character of $\Theta$.

Recall that in the previous chapter, the modular character of the
Weierstrass $\wp$ function and Eisenstein series $E_k$ was reflected by
the two transformations

$$\tau \mapsto \tau + 1 \quad \text{and} \quad \tau \mapsto -1/\tau,$$

which preserve the upper half-plane. In what follows, we shall denote
these two transformations by $T_1$ and $S$, respectively.

When looking at the $\Theta$ function, however, it will be natural to
consider instead the transformations

$$T_2 : \tau \mapsto \tau + 2 \quad \text{and} \quad S : \tau \mapsto -1/\tau,$$

since $\Theta(z|\tau + 2) = \Theta(z|\tau)$, but
$\Theta(z|\tau + 1) \ne \Theta(z|\tau)$.

Our first task is to study the transformation of $\Theta(z|\tau)$ under
the mapping $\tau \mapsto -1/\tau$.

Theorem 1.6 If $\tau \in \mathbb{H}$, then

$$\Theta(z|-1/\tau) = \sqrt{\frac{\tau}{i}} \, e^{\pi i \tau z^2} \, \Theta(z\tau|\tau) \quad \text{for all } z \in \mathbb{C}. \tag{5}$$

Here $\sqrt{\tau/i}$ denotes the branch of the square root defined on the
upper half-plane, that is positive when $\tau = it$, $t > 0$.

Proof. It suffices to prove this formula for $z = x$ real and $\tau = it$
with $t > 0$, since for each fixed $x \in \mathbb{R}$, the two sides of
equation (5) are holomorphic functions in the upper half-plane which then
agree on the positive imaginary axis, and hence must be equal everywhere.
Also, for a fixed $\tau \in \mathbb{H}$ the two sides define holomorphic
functions in $z$ that agree on the real axis, and hence must be equal
everywhere.

With $x$ real and $\tau = it$ the formula becomes

$$\sum_{n=-\infty}^{\infty} e^{-\pi n^2/t} e^{2\pi i n x} = t^{1/2} e^{-\pi t x^2} \sum_{n=-\infty}^{\infty} e^{-\pi n^2 t} e^{-2\pi n x t}.$$

Replacing $x$ by $a$, we find that we must prove

$$\sum_{n=-\infty}^{\infty} e^{-\pi t (n + a)^2} = \sum_{n=-\infty}^{\infty} t^{-1/2} e^{-\pi n^2/t} e^{2\pi i n a}.$$

However, this is precisely equation (3) in Chapter 4, which was derived
from the Poisson summation formula.
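As a numerical illustration of the transformation law (5), the following sketch compares both sides at one sample point. The helper name `theta`, the truncation level, and the sample values are assumptions made for this example only. The principal branch of `cmath.sqrt` is used for $\sqrt{\tau/i}$; this matches the branch in Theorem 1.6 because $\tau/i$ has positive real part whenever $\operatorname{Im}(\tau) > 0$.

```python
import cmath

def theta(z, tau, terms=60):
    """Truncated series for Theta(z|tau)."""
    return sum(
        cmath.exp(1j * cmath.pi * n * n * tau + 2j * cmath.pi * n * z)
        for n in range(-terms, terms + 1)
    )

# Arbitrary sample point (an assumption of this sketch).
tau = 0.4 + 1.1j
z = 0.3 + 0.2j

left_side = theta(z, -1 / tau)
right_side = (
    cmath.sqrt(tau / 1j)                      # branch positive for tau = it
    * cmath.exp(1j * cmath.pi * tau * z * z)  # e^{pi i tau z^2}
    * theta(z * tau, tau)
)

# Special case z = 0: theta(-1/tau) = sqrt(tau/i) * theta(tau).
left_at_zero = theta(0, -1 / tau)
right_at_zero = cmath.sqrt(tau / 1j) * theta(0, tau)
```

Both pairs of values should coincide up to rounding error.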

In particular, by setting $z = 0$ in the theorem, we find the following.

Corollary 1.7 If $\operatorname{Im}(\tau) > 0$, then
$\theta(-1/\tau) = \sqrt{\tau/i} \, \theta(\tau)$.

Note that if $\tau = it$, then $\theta(\tau) = \vartheta(t)$, and the above
relation is precisely the functional equation for $\vartheta$ which
appeared in Chapter 4.

The transformation law $\theta(-1/\tau) = (\tau/i)^{1/2} \theta(\tau)$
gives us very precise information about the behavior when $\tau \to 0$.
The next corollary will be used later, when we need to analyze the
behavior of $\theta(\tau)$ as $\tau \to 1$.

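Corollary 1.8 If $\tau \in \mathbb{H}$, then

$$\begin{aligned}
\theta(1 - 1/\tau) &= \sqrt{\frac{\tau}{i}} \sum_{n=-\infty}^{\infty} e^{\pi i (n + 1/2)^2 \tau} \\
&= \sqrt{\frac{\tau}{i}} \left( 2 e^{\pi i \tau/4} + \cdots \right).
\end{aligned}$$

The second identity means that
$\theta(1 - 1/\tau) \sim \sqrt{\tau/i} \, 2 e^{i\pi\tau/4}$ as
$\operatorname{Im}(\tau) \to \infty$.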

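Proof. First, we note that $n$ and $n^2$ have the same parity, so

$$\theta(1 + \tau) = \sum_{n=-\infty}^{\infty} (-1)^n e^{i\pi n^2 \tau} = \Theta(1/2|\tau),$$

hence $\theta(1 - 1/\tau) = \Theta(1/2|-1/\tau)$. Next, we use Theorem 1.6
with $z = 1/2$, and the result is

$$\begin{aligned}
\theta(1 - 1/\tau) &= \sqrt{\frac{\tau}{i}} \, e^{\pi i \tau/4} \, \Theta(\tau/2|\tau) \\
&= \sqrt{\frac{\tau}{i}} \, e^{\pi i \tau/4} \sum_{n=-\infty}^{\infty} e^{\pi i n^2 \tau} e^{\pi i n \tau} \\
&= \sqrt{\frac{\tau}{i}} \sum_{n=-\infty}^{\infty} e^{\pi i (n + 1/2)^2 \tau}.
\end{aligned}$$

The terms corresponding to $n = 0$ and $n = -1$ contribute
$2 e^{\pi i \tau/4}$, which has absolute value $2 e^{-\pi t/4}$ where
$\tau = \sigma + it$. Finally, the sum of the other terms
$n \ne 0, -1$ is of order

$$O\left( \sum_{k=1}^{\infty} e^{-(k + 1/2)^2 \pi t} \right) = O\left( e^{-9\pi t/4} \right).$$

Our final corollary of the transformation law pertains to the Dedekind
eta function, which is defined for $\operatorname{Im}(\tau) > 0$ by

$$\eta(\tau) = e^{\frac{\pi i \tau}{12}} \prod_{n=1}^{\infty} (1 - e^{2\pi i n \tau}).$$

The functional equation for $\eta$ given below will be relevant to our
discussion of the four-square theorem, and in the theory of partitions.

Proposition 1.9 If $\operatorname{Im}(\tau) > 0$, then
$\eta(-1/\tau) = \sqrt{\tau/i} \, \eta(\tau)$.

This identity is deduced by differentiating the relation in Theorem 1.6
and evaluating it at $z_0 = 1/2 + \tau/2$. The details are as follows.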

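Proof. From the product formula for the theta function, we may write
with $q = e^{\pi i \tau}$,

$$\Theta(z|\tau) = (1 + q e^{-2\pi i z}) \prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n-1} e^{2\pi i z})(1 + q^{2n+1} e^{-2\pi i z}),$$

and since the first factor vanishes at $z_0 = 1/2 + \tau/2$, we see that

$$\Theta'(z_0|\tau) = 2\pi i H(\tau), \quad \text{where } H(\tau) = \prod_{n=1}^{\infty} (1 - e^{2\pi i n \tau})^3.$$

Next, we observe that with $-1/\tau$ replaced by $\tau$ in (5), we obtain

$$\Theta(z|\tau) = \sqrt{i/\tau} \, e^{-\pi i z^2/\tau} \, \Theta(-z/\tau|-1/\tau).$$

If we differentiate this expression and then evaluate it at the point
$z_0 = 1/2 + \tau/2$, we find

$$2\pi i H(\tau) = \sqrt{i/\tau} \, e^{-\frac{\pi i}{4\tau}} e^{-\frac{\pi i}{2}} e^{-\frac{\pi i \tau}{4}} \left( -\frac{2\pi i}{\tau} \right) H(-1/\tau).$$

Hence

$$e^{\frac{\pi i \tau}{4}} H(\tau) = \left( \frac{i}{\tau} \right)^{3/2} e^{-\frac{\pi i}{4\tau}} H(-1/\tau).$$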

We note that when $\tau = it$, with $t > 0$, the function $\eta(\tau)$ is
positive, and thus taking the cube root of the above gives
$\eta(\tau) = \sqrt{i/\tau} \, \eta(-1/\tau)$; therefore this identity
holds for all $\tau \in \mathbb{H}$ by analytic continuation.

A connection between the function $\eta$ and the theory of elliptic
functions is given in Problem 5.

2 Generating functions

Given a sequence $\{F_n\}_{n=0}^{\infty}$, which may arise either
combinatorially, recursively, or in terms of some number-theoretic law, an
important tool in its study is the passage to its generating function,
defined by

$$F(x) = \sum_{n=0}^{\infty} F_n x^n.$$

Often, the defining properties of the sequence $\{F_n\}$ imply
interesting algebraic or analytic properties of the function $F(x)$, and
exploiting these can eventually lead us back to new insights about the
sequence $\{F_n\}$. A very simple-minded example is given by the Fibonacci
sequence. (See Exercise 2.) Here we want to study less elementary examples
of this idea, related to the $\Theta$ function.

We shall first discuss very briefly the theory of partitions. 

The partition function is defined as follows: if $n$ is a positive
integer, we let $p(n)$ denote the number of ways $n$ can be written as a
sum of positive integers. For instance, $p(1) = 1$, and $p(2) = 2$ since
$2 = 2 + 0 = 1 + 1$. Also, $p(3) = 3$ since
$3 = 3 + 0 = 2 + 1 = 1 + 1 + 1$. We set $p(0) = 1$ and collect some further
values of $p(n)$ in the following table.

    n      0   1   2   3   4   5   6    7    8    ...   12
    p(n)   1   1   2   3   5   7   11   15   22   ...   77


The first theorem is Euler's identity for the generating function of the
partition sequence $\{p(n)\}$, which is reminiscent of the product formula
for the zeta function.

Theorem 2.1 If $|x| < 1$, then
$$\sum_{n=0}^{\infty} p(n) x^n = \prod_{k=1}^{\infty} \frac{1}{1 - x^k}.$$

Formally, we can write each fraction as

$$\frac{1}{1 - x^k} = \sum_{m=0}^{\infty} x^{km},$$

and multiply these out together to obtain $p(n)$ as the coefficient of
$x^n$. Indeed, when we group together equal integers in a partition of
$n$, this partition can be written as

$$n = m_1 k_1 + \cdots + m_r k_r,$$

where $k_1, \ldots, k_r$ are distinct positive integers. This partition
corresponds to the term

$$(x^{k_1})^{m_1} \cdots (x^{k_r})^{m_r}$$

that arises in the product.

The justification of this formal argument proceeds as in the proof of
the product formula for the zeta function (Section 1, Chapter 7); this is
based on the convergence of the product $\prod 1/(1 - x^k)$. This
convergence in turn follows from the fact that for each fixed $|x| < 1$
one has

$$\frac{1}{1 - x^k} = 1 + O(x^k).$$
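The formal multiplication in Euler's identity can be carried out on truncated power series. The following sketch (the function name `partition_numbers` is a hypothetical choice for this illustration) multiplies the factors $1/(1 - x^k)$ one at a time and recovers the values in the table above.

```python
def partition_numbers(limit):
    """Coefficients p(0), ..., p(limit) of prod_{k>=1} 1/(1 - x^k)."""
    coeffs = [1] + [0] * limit  # the constant series 1
    for k in range(1, limit + 1):
        # Multiplying by 1/(1 - x^k) = 1 + x^k + x^{2k} + ... amounts to
        # the in-place recurrence c[n] += c[n - k] for increasing n.
        for n in range(k, limit + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs

p = partition_numbers(12)
```

Factors with $k$ larger than `limit` cannot affect coefficients up to $x^{\text{limit}}$, which is why the truncated product is enough here.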


A similar argument shows that the product $\prod 1/(1 - x^{2n-1})$ is
equal to the generating function for $p_o(n)$, the number of partitions of
$n$ into odd parts. Also, $\prod (1 + x^n)$ is the generating function for
$p_u(n)$, the number of partitions of $n$ into unequal parts. Remarkably,
$p_o(n) = p_u(n)$ for all $n$, and this translates into the identity

$$\prod_{n=1}^{\infty} \left( \frac{1}{1 - x^{2n-1}} \right) = \prod_{n=1}^{\infty} (1 + x^n).$$

To prove this note that $(1 + x^n)(1 - x^n) = 1 - x^{2n}$, and therefore

$$\prod_{n=1}^{\infty} (1 + x^n) \prod_{n=1}^{\infty} (1 - x^n) = \prod_{n=1}^{\infty} (1 - x^{2n}).$$

Moreover, taking into account the parity of integers, we have

$$\prod_{n=1}^{\infty} (1 - x^{2n}) \prod_{n=1}^{\infty} (1 - x^{2n-1}) = \prod_{n=1}^{\infty} (1 - x^n),$$

which combined with the above proves the desired identity.
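The equality $p_o(n) = p_u(n)$ can be observed directly by expanding both products as truncated power series. This is a small sketch; the function names are assumptions introduced for the example.

```python
def odd_part_partitions(limit):
    """Coefficients of prod 1/(1 - x^{2n-1}) up to x^limit."""
    coeffs = [1] + [0] * limit
    for k in range(1, limit + 1, 2):  # odd parts only
        for n in range(k, limit + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs

def unequal_part_partitions(limit):
    """Coefficients of prod (1 + x^n) up to x^limit."""
    coeffs = [1] + [0] * limit
    for k in range(1, limit + 1):
        # Multiply by (1 + x^k); iterating downward ensures that the
        # part k is used at most once.
        for n in range(limit, k - 1, -1):
            coeffs[n] += coeffs[n - k]
    return coeffs

p_odd = odd_part_partitions(30)
p_unequal = unequal_part_partitions(30)
```

For instance, $5 = 4 + 1 = 3 + 2$ gives three partitions into unequal parts, and $5 = 3 + 1 + 1 = 1 + 1 + 1 + 1 + 1$ gives three partitions into odd parts.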

The proposition that follows is deeper, and in fact involves the $\Theta$
function directly. Let $p_{e,u}(n)$ denote the number of partitions of $n$
into an even number of unequal parts, and $p_{o,u}(n)$ the number of
partitions of $n$ into an odd number of unequal parts. Then, Euler proved
that, unless $n$ is a pentagonal number, one has
$p_{e,u}(n) = p_{o,u}(n)$. By definition, pentagonal numbers$^{2}$ are
integers $n$ of the form $k(3k + 1)/2$, with $k \in \mathbb{Z}$. For
example, the first few pentagonal numbers are
$1, 2, 5, 7, 12, 15, 22, 26, \ldots$. In fact, if $n$ is pentagonal, then

$$p_{e,u}(n) - p_{o,u}(n) = (-1)^k, \quad \text{if } n = k(3k + 1)/2.$$


2 The traditional definition is as follows. Integers of the form
$n = k(k - 1)/2$, $k \in \mathbb{Z}$, are "triangular numbers"; those of
the form $n = k^2$ are "squares"; and those of the form $k(3k + 1)/2$ are
"pentagonal numbers." In general, numbers of the form
$(k/2)((\ell - 2)k + \ell - 4)$ are associated with an $\ell$-sided polygon.


To prove this result, we first observe that

$$\prod_{n=1}^{\infty} (1 - x^n) = 1 + \sum_{n=1}^{\infty} [p_{e,u}(n) - p_{o,u}(n)] x^n.$$

This follows since multiplying the terms in the product, we obtain terms
of the form $(-1)^r x^{n_1 + \cdots + n_r}$ where the integers
$n_1, \ldots, n_r$ are distinct. Hence in the coefficient of $x^n$, each
partition $n_1 + \cdots + n_r$ of $n$ into an even number of unequal parts
contributes $+1$ ($r$ is even), and each partition into an odd number of
unequal parts contributes $-1$ ($r$ is odd). This gives precisely the
coefficient $p_{e,u}(n) - p_{o,u}(n)$.

With the above identity, we see that Euler's theorem is a consequence
of the following proposition.

Proposition 2.2
$$\prod_{n=1}^{\infty} (1 - x^n) = \sum_{k=-\infty}^{\infty} (-1)^k x^{\frac{k(3k+1)}{2}}.$$

Proof. If we set $x = e^{2\pi i u}$, then we can write

$$\prod_{n=1}^{\infty} (1 - x^n) = \prod_{n=1}^{\infty} (1 - e^{2\pi i n u})$$

in terms of the triple product

$$\prod_{n=1}^{\infty} (1 - q^{2n})(1 + q^{2n-1} e^{2\pi i z})(1 + q^{2n-1} e^{-2\pi i z})$$

by letting $q = e^{3\pi i u}$ and $z = 1/2 + u/2$. This is because

$$\prod_{n=1}^{\infty} (1 - e^{2\pi i 3n u})(1 - e^{2\pi i (3n-1) u})(1 - e^{2\pi i (3n-2) u}) = \prod_{n=1}^{\infty} (1 - e^{2\pi i n u}).$$

By Theorem 1.3 the product equals

$$\begin{aligned}
\sum_{n=-\infty}^{\infty} e^{3\pi i n^2 u} (-1)^n e^{2\pi i n u/2} &= \sum_{n=-\infty}^{\infty} (-1)^n e^{\pi i n (3n+1) u} \\
&= \sum_{n=-\infty}^{\infty} (-1)^n x^{n(3n+1)/2},
\end{aligned}$$

which was to be proved.
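Proposition 2.2 can be checked coefficient by coefficient on truncated series. In the sketch below the function names are hypothetical; the first function expands $\prod (1 - x^n)$ and the second writes down the pentagonal series directly.

```python
def euler_product_coeffs(limit):
    """Coefficients of prod_{n>=1} (1 - x^n) up to x^limit."""
    coeffs = [1] + [0] * limit
    for k in range(1, limit + 1):
        # Multiply by (1 - x^k), updating from the top degree downward.
        for n in range(limit, k - 1, -1):
            coeffs[n] -= coeffs[n - k]
    return coeffs

def pentagonal_series_coeffs(limit):
    """Coefficients of sum_k (-1)^k x^{k(3k+1)/2} up to x^limit."""
    coeffs = [0] * (limit + 1)
    for k in range(-limit, limit + 1):
        exponent = k * (3 * k + 1) // 2  # k(3k+1) is always even
        if exponent <= limit:
            coeffs[exponent] += 1 if k % 2 == 0 else -1
    return coeffs

product_side = euler_product_coeffs(100)
series_side = pentagonal_series_coeffs(100)
```

The nonzero coefficients sit exactly at the pentagonal numbers $0, 1, 2, 5, 7, 12, 15, \ldots$ and equal $\pm 1$.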

We make a final comment about the partition function $p(n)$. The
nature of its growth as $n \to \infty$ can be analyzed in terms of the
behavior of $1/\prod_{n=1}^{\infty} (1 - x^n)$ as $|x| \to 1$. In fact, by
elementary considerations, we can get a rough order of growth of $p(n)$
from the growth of the generating function as $x \to 1$; see Exercises 5
and 6. A more refined analysis requires the transformation properties of
the generating function, which go back to the corresponding
Proposition 1.9 about $\eta$. This leads to a very good asymptotic formula
for $p(n)$. It may be found in Appendix A.

3 The theorems about sums of squares 

The ancient Greeks were fascinated by triples of integers $(a, b, c)$ that
occurred as sides of right triangles. These are the "Pythagorean triples,"
which satisfy $a^2 + b^2 = c^2$. According to Diophantus of Alexandria
(ca. 250 AD), if $c$ is an integer of the above kind, and $a$ and $b$ have
no common factors (a case to which one may easily reduce), then $c$ is the
sum of two squares, that is, $c = m^2 + n^2$ with $m, n \in \mathbb{Z}$;
and conversely, any such $c$ arises as the hypotenuse of a triangle whose
sides are given by a Pythagorean triple $(a, b, c)$. (See Exercise 8.)
Therefore, it is natural to ask the following question: which integers can
be written as the sum of two squares? It is easy to see that no number of
the form $4k + 3$ can be so written, but to determine which integers can
be expressed in this way is not obvious.

Let us pose the question in a more quantitative form. We define $r_2(n)$
to be the number of ways $n$ can be written as the sum of two squares,
counting obvious repetitions; that is, $r_2(n)$ is the number of pairs
$(x, y)$, $x, y \in \mathbb{Z}$, so that

$$n = x^2 + y^2.$$

For example, $r_2(3) = 0$, but $r_2(5) = 8$ because
$5 = (\pm 2)^2 + (\pm 1)^2$, and also $5 = (\pm 1)^2 + (\pm 2)^2$. Hence,
our first problem can be posed as follows:

Sum of two squares: Which integers can be written as a
sum of two squares? More precisely, can one determine an
expression for $r_2(n)$?

Next, since not every positive integer can be expressed as the sum of
two squares, we may ask if three squares, or possibly four squares suffice.
However, the fact is that there are infinitely many integers that cannot
be written as the sum of three squares, since it is easy to check that no
integer of the form $8k + 7$ can be so written. So we turn to the question
of four squares and define, in analogy with $r_2(n)$, the function
$r_4(n)$ to be the number of ways of expressing $n$ as a sum of four
squares. Therefore, a second problem that arises is:

Sum of four squares: Can every positive integer be written
as a sum of four squares? More precisely, determine a formula
for $r_4(n)$.

It turns out that the problems of two squares and four squares, which
go back to the third century, were not resolved until about 1500 years
later, and their full solution was first given by the use of Jacobi's
theory of theta functions!

3.1 The two-squares theorem 

The problem of representing an integer as the sum of two squares, while
obviously additive in nature, has a nice multiplicative aspect: if $n$ and
$m$ are two integers that can be written as the sum of two squares, then
so can their product $nm$. Indeed, suppose $n = a^2 + b^2$,
$m = c^2 + d^2$, and consider the complex number

$$x + iy = (a + ib)(c + id).$$

Clearly, $x$ and $y$ are integers since $a, b, c, d \in \mathbb{Z}$, and by
taking absolute values on both sides we see that

$$x^2 + y^2 = (a^2 + b^2)(c^2 + d^2),$$

and it follows that $nm = x^2 + y^2$.
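The multiplicative step is short enough to spell out in integers. In this sketch the function name `compose_two_squares` is a hypothetical label for expanding $(a + ib)(c + id)$ into real and imaginary parts.

```python
def compose_two_squares(a, b, c, d):
    """Return (x, y) with x + iy = (a + ib)(c + id)."""
    x = a * c - b * d
    y = a * d + b * c
    return x, y

# Example: 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2, so 65 is a sum of two squares.
x, y = compose_two_squares(1, 2, 2, 3)
```

Here $x = -4$ and $y = 7$, and indeed $16 + 49 = 65 = 5 \cdot 13$.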

For these reasons the divisibility properties of $n$ play a crucial role
in determining $r_2(n)$. To state the basic result we define two new
divisor functions: we let $d_1(n)$ denote the number of divisors of $n$ of
the form $4k + 1$, and $d_3(n)$ the number of divisors of $n$ of the form
$4k + 3$. The main result of this section provides a complete answer to
the two-squares problem:

Theorem 3.1 If $n \ge 1$, then $r_2(n) = 4(d_1(n) - d_3(n))$.

A direct consequence of the above formula for $r_2(n)$ may be stated as
follows. If $n = p_1^{a_1} \cdots p_r^{a_r}$ is the prime factorization of
$n$ where $p_1, \ldots, p_r$ are distinct, then:

The positive integer $n$ can be represented as the sum of two
squares if and only if every prime $p_j$ of the form $4k + 3$ that
occurs in the factorization of $n$ has an even exponent $a_j$.

The proof of this deduction is outlined in Exercise 9.
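Theorem 3.1 is easy to test by brute force for small $n$. The sketch below uses hypothetical helper names: it counts lattice points on the circle $x^2 + y^2 = n$ and compares with the divisor count.

```python
from math import isqrt

def r2(n):
    """Number of pairs (x, y) of integers with x^2 + y^2 = n."""
    count = 0
    bound = isqrt(n)
    for x in range(-bound, bound + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            # y and -y are distinct solutions unless y = 0
            count += 1 if y == 0 else 2
    return count

def divisors_mod_4(n, residue):
    """Number of positive divisors of n congruent to residue mod 4."""
    return sum(1 for k in range(1, n + 1) if n % k == 0 and k % 4 == residue)

def two_squares_formula(n):
    """The right-hand side 4 (d_1(n) - d_3(n)) of Theorem 3.1."""
    return 4 * (divisors_mod_4(n, 1) - divisors_mod_4(n, 3))

mismatches = [n for n in range(1, 201) if r2(n) != two_squares_formula(n)]
```

The list `mismatches` should be empty for the range tested, in agreement with the theorem.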

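To prove the theorem, we first establish a crucial relationship that
identifies the generating function of the sequence $\{r_2(n)\}$ with the
square of the $\theta$ function, namely

$$\theta(\tau)^2 = \sum_{n=0}^{\infty} r_2(n) q^n \tag{6}$$

whenever $q = e^{\pi i \tau}$ with $\tau \in \mathbb{H}$. The proof of this
identity relies simply on the definition of $r_2$ and $\theta$. Indeed, if
we first recall that $\theta(\tau) = \sum_{n=-\infty}^{\infty} q^{n^2}$,
then we obtain

$$\begin{aligned}
\theta(\tau)^2 &= \left( \sum_{n_1=-\infty}^{\infty} q^{n_1^2} \right) \left( \sum_{n_2=-\infty}^{\infty} q^{n_2^2} \right) \\
&= \sum_{(n_1, n_2) \in \mathbb{Z} \times \mathbb{Z}} q^{n_1^2 + n_2^2} \\
&= \sum_{n=0}^{\infty} r_2(n) q^n,
\end{aligned}$$

since $r_2(n)$ counts the number of pairs $(n_1, n_2)$ with
$n_1^2 + n_2^2 = n$.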

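Proposition 3.2 The identity $r_2(n) = 4(d_1(n) - d_3(n))$, $n \ge 1$, is
equivalent to the identities

$$\theta(\tau)^2 = 2 \sum_{n=-\infty}^{\infty} \frac{1}{q^n + q^{-n}} = 1 + 4 \sum_{n=1}^{\infty} \frac{q^n}{1 + q^{2n}}, \tag{7}$$

whenever $q = e^{\pi i \tau}$ and $\tau \in \mathbb{H}$.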

Proof. We note first that both series converge absolutely since
$|q| < 1$, and the first equals the second, because
$1/(q^n + q^{-n}) = q^{|n|}/(1 + q^{2|n|})$.

Since $(1 + q^{2n})^{-1} = (1 - q^{2n})/(1 - q^{4n})$, the right-hand side
of (7) equals

$$1 + 4 \sum_{n=1}^{\infty} \left( \frac{q^n}{1 - q^{4n}} - \frac{q^{3n}}{1 - q^{4n}} \right).$$

However, since $1/(1 - q^{4n}) = \sum_{m=0}^{\infty} q^{4nm}$, we have

$$\sum_{n=1}^{\infty} \frac{q^n}{1 - q^{4n}} = \sum_{n=1}^{\infty} \sum_{m=0}^{\infty} q^{n(4m+1)} = \sum_{k=1}^{\infty} d_1(k) q^k,$$

because $d_1(k)$ counts the number of divisors of $k$ that are of the form
$4m + 1$. Observe that the series $\sum d_1(k) q^k$ converges since
$d_1(k) \le k$.

A similar argument shows that

$$\sum_{n=1}^{\infty} \frac{q^{3n}}{1 - q^{4n}} = \sum_{k=1}^{\infty} d_3(k) q^k,$$

and the proof of the proposition is complete.

In effect, we see that the identity (6) links the original problem in
arithmetic with the problem in complex analysis of establishing the
relation (7).
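For a feel of the analytic identity relating $\theta(\tau)^2$ to the two series in Proposition 3.2, one can compare truncations numerically at a sample point. The names, truncation levels, and sample value of $\tau$ below are assumptions of this sketch, and the agreement it shows is a sanity check rather than a proof.

```python
import cmath

def theta_squared(tau, terms=40):
    """Square of the truncated series sum_n e^{pi i n^2 tau}."""
    s = sum(cmath.exp(1j * cmath.pi * n * n * tau) for n in range(-terms, terms + 1))
    return s * s

def lambert_side(tau, terms=150):
    """Truncation of 1 + 4 sum_{n>=1} q^n / (1 + q^{2n})."""
    q = cmath.exp(1j * cmath.pi * tau)
    total = 1 + 0j
    qn = q
    for _ in range(terms):
        total += 4 * qn / (1 + qn * qn)
        qn *= q
    return total

def two_sided_side(tau, terms=150):
    """Truncation of 2 sum_{n=-N}^{N} 1 / (q^n + q^{-n})."""
    q = cmath.exp(1j * cmath.pi * tau)
    total = 1 + 0j  # the n = 0 term contributes 2 * (1/2)
    qn = q
    for _ in range(terms):
        total += 2 * 2 / (qn + 1 / qn)  # the terms n and -n are equal
        qn *= q
    return total

tau = 0.3 + 0.7j  # arbitrary point in the upper half-plane
value_theta = theta_squared(tau)
value_lambert = lambert_side(tau)
value_two_sided = two_sided_side(tau)
```

All three values should agree up to rounding error.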

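We shall now find it convenient to use $C(\tau)$ to denote$^{3}$

$$C(\tau) = 2 \sum_{n=-\infty}^{\infty} \frac{1}{q^n + q^{-n}} = \sum_{n=-\infty}^{\infty} \frac{1}{\cos(n\pi\tau)}, \tag{8}$$

where $q = e^{\pi i \tau}$ and $\tau \in \mathbb{H}$. Our work then becomes
to prove the identity $\theta(\tau)^2 = C(\tau)$.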

What is truly remarkable are the different yet parallel ways that the
functions $\theta$ and $C$ arise. The genesis of the function $\theta$ may
be thought to be the heat diffusion equation on the real line; the
corresponding heat kernel is given in terms of the Gaussian
$e^{-\pi x^2}$ which is its own Fourier transform; and finally the
transformation rule for $\theta$ results from the Poisson summation
formula.

The parallel with $C$ is that it arises from another differential
equation: the steady-state heat equation in a strip; there, the
corresponding kernel is $1/\cosh \pi x$ (Section 1.3, Chapter 8), which
again is its own Fourier transform (Example 3, Chapter 3). The
transformation rule for $C$ results, once again, from the Poisson
summation formula.

To prove the identity $\theta^2 = C$ we will first show that these two
functions satisfy the same structural properties. For $\theta^2$ we had
the transformation law $\theta(\tau)^2 = (i/\tau) \theta(-1/\tau)^2$
(Corollary 1.7).


3 We denote the function by $C$ because we are summing a series of cosines.

An identical transformation law holds for $C(\tau)$! Indeed, if we set
$a = 0$ in the relation (5) of Chapter 4 we obtain

$$\sum_{n=-\infty}^{\infty} \frac{1}{\cosh(\pi n t)} = \frac{1}{t} \sum_{n=-\infty}^{\infty} \frac{1}{\cosh(\pi n/t)}.$$

This is precisely the identity

$$C(\tau) = (i/\tau) C(-1/\tau)$$

for $\tau = it$, $t > 0$, which therefore also holds for all
$\tau \in \mathbb{H}$ by analytic continuation.

It is also obvious from their definitions that both $\theta(\tau)^2$ and
$C(\tau)$ tend to 1 as $\operatorname{Im}(\tau) \to \infty$. The last
property we want to examine is the behavior of both functions at the
"cusp" $\tau = 1$.$^{4}$

For $\theta^2$ we shall invoke Corollary 1.8 to see that
$\theta(1 - 1/\tau)^2 \sim 4(\tau/i) e^{\pi i \tau/2}$ as
$\operatorname{Im}(\tau) \to \infty$.

For $C$ we can do the same, again using the Poisson summation formula.

In fact, if we set $a = 1/2$ in equation (5), Chapter 4, we find

$$\sum_{n=-\infty}^{\infty} \frac{(-1)^n}{\cosh(\pi n/t)} = t \sum_{n=-\infty}^{\infty} \frac{1}{\cosh(\pi (n + 1/2) t)}.$$

Therefore, by analytic continuation we deduce that

$$C(1 - 1/\tau) = \left( \frac{\tau}{i} \right) \sum_{n=-\infty}^{\infty} \frac{1}{\cos(\pi (n + 1/2) \tau)}.$$

The main terms of this sum are those for $n = -1$ and $n = 0$. This easily
gives

$$C(1 - 1/\tau) = 4 \left( \frac{\tau}{i} \right) e^{\pi i \tau/2} + O\left( |\tau| e^{-3\pi t/2} \right), \quad \text{as } t \to \infty,$$

and where $\tau = \sigma + it$. We summarize our conclusions in a
proposition.

Proposition 3.3 The function $C(\tau) = \sum 1/\cos(\pi n \tau)$, defined
in the upper half-plane, satisfies

(i) $C(\tau + 2) = C(\tau)$.

(ii) $C(\tau) = (i/\tau) C(-1/\tau)$.

(iii) $C(\tau) \to 1$ as $\operatorname{Im}(\tau) \to \infty$.

(iv) $C(1 - 1/\tau) \sim 4(\tau/i) e^{\pi i \tau/2}$ as $\operatorname{Im}(\tau) \to \infty$.

Moreover, $\theta(\tau)^2$ satisfies the same properties.


4 Why we refer to the point $\tau = 1$ as a cusp, and the reason for its
importance, will become clear later on.

With this proposition, we prove the identity $\theta(\tau)^2 = C(\tau)$
with the aid of the following theorem, in which we shall ultimately set
$f = C/\theta^2$.

Theorem 3.4 Suppose $f$ is a holomorphic function in the upper half-plane
that satisfies:

(i) $f(\tau + 2) = f(\tau)$,

(ii) $f(-1/\tau) = f(\tau)$,

(iii) $f(\tau)$ is bounded,

then $f$ is constant.

For the proof of this theorem, we introduce the following subset of the
closed upper half-plane, which is defined by

$$\mathcal{F} = \{ \tau \in \overline{\mathbb{H}} : |\operatorname{Re}(\tau)| \le 1 \text{ and } |\tau| \ge 1 \},$$

and illustrated in Figure 1.

The points corresponding to $\tau = \pm 1$ are called cusps. They are
equivalent under the mapping $\tau \mapsto \tau + 2$.

Lemma 3.5 Every point in the upper half-plane can be mapped into
$\mathcal{F}$ using repeatedly one or another of the following fractional
linear transformations or their inverses:

$$T_2 : \tau \mapsto \tau + 2, \quad S : \tau \mapsto -1/\tau.$$

For this reason, $\mathcal{F}$ is called the fundamental domain$^{5}$ for
the group of transformations generated by $T_2$ and $S$.

In fact, we let $G$ denote the group generated by $T_2$ and $S$. Since
$T_2$ and $S$ are fractional linear transformations, we may represent an
element $g \in G$ by a matrix

$$g = \begin{pmatrix} a & b \\ c & d \end{pmatrix},$$

with the understanding that

$$g(\tau) = \frac{a\tau + b}{c\tau + d}.$$

Since the matrices representing $T_2$ and $S$ have integer coefficients
and determinant 1, the same is true for all matrices of elements in $G$.
In particular, if $\tau \in \mathbb{H}$, then

$$\operatorname{Im}(g(\tau)) = \frac{\operatorname{Im}(\tau)}{|c\tau + d|^2}. \tag{9}$$

Proof of Lemma 3.5. Let $\tau \in \mathbb{H}$. If $g \in G$ with
$g(\tau) = (a\tau + b)/(c\tau + d)$, then $c$ and $d$ are integers, and by
(9) we may choose a $g_0 \in G$ such that $\operatorname{Im}(g_0(\tau))$ is
maximal. Since the translations $T_2$ and their inverses do not change
imaginary parts, we may apply finitely many of them to see that there
exists $g_1 \in G$ with $|\operatorname{Re}(g_1(\tau))| \le 1$ and
$\operatorname{Im}(g_1(\tau))$ maximal. It now suffices to prove that
$|g_1(\tau)| \ge 1$ to conclude that $g_1(\tau) \in \mathcal{F}$. If this
were not true, that is, $|g_1(\tau)| < 1$, then
$\operatorname{Im}(S g_1(\tau))$ would be greater than
$\operatorname{Im}(g_1(\tau))$ since

$$\operatorname{Im}(S g_1(\tau)) = \operatorname{Im}(-1/g_1(\tau)) = \frac{\operatorname{Im}(g_1(\tau))}{|g_1(\tau)|^2} > \operatorname{Im}(g_1(\tau)),$$

and this contradicts the maximality of $\operatorname{Im}(g_1(\tau))$.
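The lemma can be made concrete by a simple reduction loop. The sketch below is not the argument of the proof (which picks a maximal imaginary part); it is an illustrative iteration, under the assumption that alternating even translations with $S$ terminates in floating point for the sample used. The function name and the step limit are hypothetical.

```python
def reduce_to_fundamental_domain(tau, max_steps=1000):
    """Apply powers of T_2 and S until |Re(tau)| <= 1 and |tau| >= 1."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    for _ in range(max_steps):
        # Translate by an even integer (a power of T_2) so that |Re| <= 1.
        tau = tau - 2 * round(tau.real / 2)
        if abs(tau) >= 1:
            return tau
        # Otherwise apply S, which strictly increases the imaginary part
        # because Im(-1/tau) = Im(tau) / |tau|^2 and |tau| < 1.
        tau = -1 / tau
    raise RuntimeError("no reduction found within max_steps")

start = 0.1 + 0.01j
reduced = reduce_to_fundamental_domain(start)
```

Each application of $S$ in the loop raises the imaginary part, which mirrors the mechanism used in the proof.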

We can now prove the theorem. Suppose $f$ is not constant, and let
$g(z) = f(\tau)$ where $z = e^{\pi i \tau}$. The function $g$ is well
defined for $z$ in the punctured unit disc, since $f$ is periodic of
period 2, and moreover, $g$ is bounded near the origin by assumption (iii)
of the theorem. Hence 0 is a removable singularity for $g$, and
$\lim_{z \to 0} g(z) = \lim_{\operatorname{Im}(\tau) \to \infty} f(\tau)$
exists.


5 Strictly speaking, the notion of a fundamental domain requires that
every point have a unique representative in the domain. In the present
case, ambiguity arises only for points that are on the boundary of
$\mathcal{F}$.


So by the maximum modulus principle,

$$\lim_{\operatorname{Im}(\tau) \to \infty} |f(\tau)| < \sup_{\tau \in \mathcal{F}} |f(\tau)|.$$

Now we must investigate the behavior of $f$ at the points $\tau = \pm 1$.
Since $f(\tau + 2) = f(\tau)$, it suffices to consider the point
$\tau = 1$. We claim that

$$\lim_{\operatorname{Im}(\tau) \to \infty} f(1 - 1/\tau)$$

exists and moreover

$$\lim_{\operatorname{Im}(\tau) \to \infty} |f(1 - 1/\tau)| < \sup_{\tau \in \mathcal{F}} |f(\tau)|.$$

The argument is essentially the same as the one above, except that we
first need to interchange $\tau = 1$ with the point at infinity. In other
words, we wish to investigate the behavior of $F(\tau) = f(1 - 1/\tau)$
for $\tau$ near $\infty$. The important step is to prove that $F$ is
periodic. To this end, we consider the fractional linear transformation
associated to the matrix

$$U_n = \begin{pmatrix} 1 - n & n \\ -n & 1 + n \end{pmatrix},$$

that is,

$$\tau \mapsto \frac{(1 - n)\tau + n}{-n\tau + (1 + n)},$$

which maps 1 to 1. Now let $\mu(\tau) = 1/(1 - \tau)$, which maps 1 to
$\infty$, and whose inverse $\mu^{-1}(\tau) = 1 - 1/\tau$ takes $\infty$
to 1. Then

$$U_n = \mu^{-1} T_n \mu,$$

where $T_n$ is the translation $T_n(\tau) = \tau + n$. As a consequence,

$$U_n U_m = U_{n+m},$$

and

$$U_{-1} = T_2 S.$$

Thus any $U_n$ can be obtained by finitely many applications of $T_2$,
$S$, or their inverses. Since $f$ is invariant under $T_2$ and $S$, it is
also invariant under $U_m$. So we find that

$$f(\mu^{-1} T_n \mu(\tau)) = f(\tau).$$

Therefore, if we let $F(\tau) = f(\mu^{-1}(\tau)) = f(1 - 1/\tau)$, we
find that $F$ is periodic of period 1, that is,

$$F(T_n \tau) = F(\tau) \quad \text{for every integer } n.$$

Now, by the previous argument, if we set $h(z) = F(\tau)$ with
$z = e^{2\pi i \tau}$, we see that $h$ has a removable singularity at
$z = 0$, and the desired inequality follows by the maximum principle.

We conclude from this analysis that $f$ attains its maximum in the
interior of the upper half-plane, and this contradicts the maximum
principle.

The proof of the two-squares theorem is now only one step away.

We consider the function $f(\tau) = C(\tau)/\theta(\tau)^2$. Since we know
by the product formula that $\theta(\tau)$ does not vanish in the upper
half-plane (Corollary 1.4), we find that $f$ is holomorphic in
$\mathbb{H}$. Moreover, by Proposition 3.3, $f$ is invariant under the
transformations $T_2$ and $S$, that is, $f(\tau + 2) = f(\tau)$ and
$f(-1/\tau) = f(\tau)$. Finally, in the fundamental domain $\mathcal{F}$,
the function $f(\tau)$ is bounded, and in fact tends to 1 as
$\operatorname{Im}(\tau)$ tends to infinity, or as $\tau$ tends to the
cusps $\pm 1$. This is because of properties (iii) and (iv) in
Proposition 3.3, which are verified by both $C$ and $\theta^2$. Thus $f$
is bounded in $\mathbb{H}$. The result is that $f$ is a constant, which
must be 1, proving that $\theta(\tau)^2 = C(\tau)$, and with it the
two-squares theorem.


3.2 The four-squares theorem 
Statement of the theorem 

In the rest of this chapter, we shall consider the case of four squares.
More precisely, we will prove that every positive integer is the sum of
four squares, and moreover we will determine a formula for $r_4(n)$ that
describes the number of ways this can be done.

We need to introduce another divisor function, which we denote by
$\sigma_1^*(n)$, and which equals the sum of divisors of $n$ that are not divisible
by 4. The main theorem we shall prove is the following.

Theorem 3.6 Every positive integer is the sum of four squares, and
moreover

\[ r_4(n) = 8\sigma_1^*(n) \quad \text{for all } n \geq 1. \]

As before, we relate the sequence $\{r_4(n)\}$ via its generating function
to an appropriate power of the function $\theta$, which in this case is its fourth
power. The result is that

\[ \theta(\tau)^4 = \sum_{n=0}^\infty r_4(n) q^n \]

whenever $q = e^{\pi i \tau}$ with $\tau \in \mathbb{H}$.
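The statement of Theorem 3.6 can be tested numerically for small $n$ before it is proved. The following is a minimal sketch, not part of the argument: it counts representations by brute force and compares with $8\sigma_1^*(n)$. The function names `r4` and `sigma1_star` are chosen here for illustration only.

```python
from math import isqrt


def r4(n: int) -> int:
    """Count ordered integer 4-tuples (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = n."""
    bound = isqrt(n)
    count = 0
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            for c in range(-bound, bound + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    continue
                d = isqrt(rest)
                if d * d == rest:
                    # d = 0 gives one tuple, otherwise d and -d both count
                    count += 1 if d == 0 else 2
    return count


def sigma1_star(n: int) -> int:
    """Sum of the divisors of n that are not divisible by 4."""
    return sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


if __name__ == "__main__":
    for n in range(1, 21):
        print(n, r4(n), 8 * sigma1_star(n))
```

Signs and order are counted, so for instance the eight tuples obtained from $(\pm 1, 0, 0, 0)$ by permutation account for $r_4(1)$.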

The next step is to find the modular function whose equality with
$\theta(\tau)^4$ expresses the identity $r_4(n) = 8\sigma_1^*(n)$. Unfortunately, here there
is nothing as simple as the function $C(\tau)$ that arose in the two-squares
theorem; instead we shall need to construct a rather subtle variant of the
Eisenstein series considered in the previous chapter. In fact, we define

\[ E_2^*(\tau) = \sum_m \sum_n \frac{1}{\left(\frac{m\tau}{2} + n\right)^2} - \sum_m \sum_n \frac{1}{\left(m\tau + \frac{n}{2}\right)^2} \]

for $\tau \in \mathbb{H}$. The indicated order of summation is critical, since the above
series do not converge absolutely. The following reduces the four-squares
theorem to the modular properties of $E_2^*$.

Proposition 3.7 The assertion $r_4(n) = 8\sigma_1^*(n)$ is equivalent to the
identity

\[ \theta(\tau)^4 = \frac{-1}{\pi^2} E_2^*(\tau), \quad \text{where } \tau \in \mathbb{H}. \]

Proof. It suffices to prove that if $q = e^{\pi i \tau}$, then

\[ \frac{-1}{\pi^2} E_2^*(\tau) = 1 + \sum_{k=1}^\infty 8\sigma_1^*(k) q^k. \]

First, recall the forbidden Eisenstein series that we considered in the
last section of the previous chapter, and which is defined by

\[ F(\tau) = \sum_m \left[ \sum_n \frac{1}{(m\tau + n)^2} \right], \]

where the term $n = m = 0$ is omitted. Since the sum above is not absolutely
convergent, the order of summation, first in $n$ and then in $m$, is
crucial. With this in mind, the definitions of $E_2^*$ and $F$ give immediately

(10) \[ E_2^*(\tau) = F(\tau/2) - 4F(2\tau). \]

In Corollary 2.6 (and Exercise 7) of the last chapter, we proved that

\[ F(\tau) = \frac{\pi^2}{3} - 8\pi^2 \sum_{k=1}^\infty \sigma_1(k) e^{2\pi i k \tau}, \]

where $\sigma_1(k)$ is the sum of the divisors of $k$.
Now observe that

\[ \sigma_1^*(n) = \begin{cases} \sigma_1(n) & \text{if } n \text{ is not divisible by 4,} \\ \sigma_1(n) - 4\sigma_1(n/4) & \text{if } n \text{ is divisible by 4.} \end{cases} \]

Indeed, if $n$ is not divisible by 4, then no divisors of $n$ are divisible by
4. If $n = 4\tilde{n}$, and $d$ is a divisor of $n$ that is divisible by 4, say $d = 4\tilde{d}$,
then $\tilde{d}$ divides $\tilde{n}$. This gives the second formula. Therefore, from this
observation and (10) we find that

\[ E_2^*(\tau) = -\pi^2 - 8\pi^2 \sum_{k=1}^\infty \sigma_1^*(k) e^{\pi i k \tau}, \]

and the proof of the proposition is complete.
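The case distinction for $\sigma_1^*(n)$ used in this proof is easy to confirm by direct computation. The sketch below is an illustration only; the helper names `sigma1`, `sigma1_star`, and `sigma1_star_via_sigma1` are assumptions of this example and do not come from the text.

```python
def sigma1(n: int) -> int:
    """Sum of all positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def sigma1_star(n: int) -> int:
    """Sum of the divisors of n that are not divisible by 4 (the definition)."""
    return sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)


def sigma1_star_via_sigma1(n: int) -> int:
    """The same quantity computed from the two-case formula in the proof."""
    if n % 4 != 0:
        return sigma1(n)
    return sigma1(n) - 4 * sigma1(n // 4)


if __name__ == "__main__":
    for n in (1, 4, 6, 8, 12, 16):
        print(n, sigma1_star(n), sigma1_star_via_sigma1(n))
```

The subtraction $4\sigma_1(n/4)$ removes exactly the divisors of the form $d = 4\tilde{d}$ with $\tilde{d}$ dividing $n/4$, which is the observation made in the proof.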

We have therefore reduced Theorem 3.6 to the identity $\theta^4 = -\pi^{-2} E_2^*$,
and the key to establish this relation is that $E_2^*$ satisfies the same modular
properties as $\theta(\tau)^4$.

Proposition 3.8 The function $E_2^*(\tau)$ defined in the upper half-plane has
the following properties:

(i) $E_2^*(\tau + 2) = E_2^*(\tau)$.

(ii) $E_2^*(\tau) = -\tau^{-2} E_2^*(-1/\tau)$.

(iii) $E_2^*(\tau) \to -\pi^2$ as $\mathrm{Im}(\tau) \to \infty$.

(iv) $|E_2^*(1 - 1/\tau)| = O(|\tau^2 e^{\pi i \tau}|)$ as $\mathrm{Im}(\tau) \to \infty$.

Moreover $-\pi^2 \theta^4$ has the same properties.

The periodicity (i) of $E_2^*$ is immediate from the definition. The proofs
of the other properties of $E_2^*$ are a little more involved.

Consider the forbidden Eisenstein series $F$ and its reverse $\tilde{F}$, which is
obtained from reversing the order of summation:

\[ F(\tau) = \sum_m \sum_n \frac{1}{(m\tau + n)^2} \quad \text{and} \quad \tilde{F}(\tau) = \sum_n \sum_m \frac{1}{(m\tau + n)^2}. \]

In both cases, the term $n = m = 0$ is omitted.

Lemma 3.9 The functions $F$ and $\tilde{F}$ satisfy:

(a) $F(-1/\tau) = \tau^2 \tilde{F}(\tau)$,

(b) $F(\tau) - \tilde{F}(\tau) = 2\pi i/\tau$,

(c) $F(-1/\tau) = \tau^2 F(\tau) - 2\pi i \tau$.

Proof. Property (a) follows directly from the definitions of $F$ and $\tilde{F}$,
and the identity

\[ (n + m(-1/\tau))^2 = \tau^{-2}(-m + n\tau)^2. \]

To prove property (b), we invoke the functional equation for the Dedekind
eta function which was established earlier:

\[ \eta(-1/\tau) = \sqrt{\tau/i}\, \eta(\tau), \]

where $\eta(\tau) = q^{1/12} \prod_{n=1}^\infty (1 - q^{2n})$, and $q = e^{\pi i \tau}$.

First, we take the logarithmic derivative of $\eta$ with respect to the
variable $\tau$ to find (by Proposition 3.2 in Chapter 5)

\[ (\eta'/\eta)(\tau) = \frac{\pi i}{12} - 2\pi i \sum_{n=1}^\infty \frac{n q^{2n}}{1 - q^{2n}}. \]

However, if $\sigma_1(k)$ denotes the sum of the divisors of $k$, then one sees that

\[ \sum_{n=1}^\infty \frac{n q^{2n}}{1 - q^{2n}} = \sum_{n=1}^\infty \sum_{\ell=0}^\infty n q^{2n} q^{2n\ell} = \sum_{n=1}^\infty \sum_{m=1}^\infty n q^{2nm} = \sum_{k=1}^\infty \sigma_1(k) q^{2k}. \]

If we recall that $F(\tau) = \pi^2/3 - 8\pi^2 \sum_{k=1}^\infty \sigma_1(k) q^{2k}$, we find

\[ (\eta'/\eta)(\tau) = \frac{i}{4\pi} F(\tau). \]

By the chain rule, the logarithmic derivative of $\eta(-1/\tau)$ is $\tau^{-2}(\eta'/\eta)(-1/\tau)$
and using property (a), we see that the logarithmic derivative of $\eta(-1/\tau)$
equals $(i/4\pi)\tilde{F}(\tau)$. Therefore, taking the logarithmic derivative of the
functional equation for $\eta$ we find

\[ \frac{i}{4\pi} \tilde{F}(\tau) = \frac{1}{2\tau} + \frac{i}{4\pi} F(\tau), \]

and this gives $\tilde{F}(\tau) = -2\pi i/\tau + F(\tau)$, as desired.

Finally, (c) is a consequence of (a) and (b).

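To prove the transformation formula (ii) for $E_2^*$ under $\tau \mapsto -1/\tau$, we
begin with

\[ E_2^*(\tau) = F(\tau/2) - 4F(2\tau). \]

Then

\[ \begin{aligned} E_2^*(-1/\tau) &= F(-1/(2\tau)) - 4F(-2/\tau) \\ &= [4\tau^2 F(2\tau) - 4\pi i \tau] - 4[(\tau/2)^2 F(\tau/2) - \pi i \tau] \\ &= 4\tau^2 F(2\tau) - 4(\tau^2/4) F(\tau/2) \\ &= -\tau^2 (F(\tau/2) - 4F(2\tau)) \end{aligned} \]

as desired. To prove the third property recall that

\[ F(\tau) = \frac{\pi^2}{3} - 8\pi^2 \sum_{k=1}^\infty \sigma_1(k) e^{2\pi i k \tau}, \]

where the sum goes to 0 as $\mathrm{Im}(\tau) \to \infty$. Then, if we use the fact that

\[ E_2^*(\tau) = F(\tau/2) - 4F(2\tau), \]

we conclude that $E_2^*(\tau) \to -\pi^2$ as $\mathrm{Im}(\tau) \to \infty$.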

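To prove the final property, we begin by showing that

(11) \[ E_2^*(1 - 1/\tau) = \tau^2 \left( F\left(\frac{\tau - 1}{2}\right) - F(\tau/2) \right). \]

From the transformation formulas for $F$ we have

\[ F(1/2 - 1/2\tau) = F\left(\frac{\tau - 1}{2\tau}\right) = \left(\frac{2\tau}{\tau - 1}\right)^2 F\left(\frac{2\tau}{1 - \tau}\right) - 2\pi i \frac{2\tau}{1 - \tau}, \]

and

\[ F\left(\frac{2\tau}{1 - \tau}\right) = F(-2 + 2/(1 - \tau)) = F\left(\frac{2}{1 - \tau}\right) = \left(\frac{\tau - 1}{2}\right)^2 F\left(\frac{\tau - 1}{2}\right) - 2\pi i \frac{\tau - 1}{2}. \]

Hence,

\[ F(1/2 - 1/2\tau) = \tau^2 F\left(\frac{\tau - 1}{2}\right) - 2\pi i \frac{2\tau}{1 - \tau} - 2\pi i \frac{(2\tau)^2}{(\tau - 1)^2} \frac{\tau - 1}{2}. \]

But $F(2 - 2/\tau) = F(-2/\tau) = (\tau^2/4) F(\tau/2) - 2\pi i \tau/2$, and hence

\[ \begin{aligned} E_2^*(1 - 1/\tau) &= F(1/2 - 1/2\tau) - 4F(2 - 2/\tau) \\ &= \tau^2 \left( F\left(\frac{\tau - 1}{2}\right) - F(\tau/2) \right) - 2\pi i \left( \frac{2\tau}{1 - \tau} + \frac{2\tau^2}{\tau - 1} - 2\tau \right) \\ &= \tau^2 \left( F\left(\frac{\tau - 1}{2}\right) - F(\tau/2) \right). \end{aligned} \]

This proves (11). Then, the last property follows from it and the fact
that

\[ F(\tau) = \frac{\pi^2}{3} - 8\pi^2 \sum_{k=1}^\infty \sigma_1(k) e^{2\pi i k \tau}. \]

Thus Proposition 3.8 is proved.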

We can now conclude the proof of the four-squares theorem by considering
the quotient $f(\tau) = E_2^*(\tau)/\theta(\tau)^4$, and applying Theorem 3.4, as in
the two-squares theorem. Recall $\theta(\tau)^4 \to 1$ and $\theta(1 - 1/\tau)^4 \sim 16\tau^2 e^{\pi i \tau}$,
as $\mathrm{Im}(\tau) \to \infty$. The result is that $f(\tau)$ is a constant, which equals $-\pi^2$ by
Proposition 3.8. This completes the proof of the four-squares theorem.


4 Exercises 


1. Prove that

\[ \frac{(\Theta'(z|\tau))^2 - \Theta(z|\tau)\Theta''(z|\tau)}{\Theta(z|\tau)^2} = \wp_\tau(z - 1/2 - \tau/2) + c_\tau, \]

where $c_\tau$ can be expressed in terms of the first two derivatives of $\Theta(z|\tau)$ with
respect to $z$ at $z = 1/2 + \tau/2$. Compare this formula with the result in Exercise 5
in the previous chapter.

2. Consider the Fibonacci numbers $\{F_n\}_{n=0}^\infty$, defined by the two initial values
$F_0 = 0$, $F_1 = 1$ and the recursion relation

\[ F_n = F_{n-1} + F_{n-2} \quad \text{for } n \geq 2. \]

(a) Consider the generating function $F(x) = \sum_{n=0}^\infty F_n x^n$ associated to
$\{F_n\}$, and prove that

\[ F(x) = x^2 F(x) + x F(x) + x \]

for all $x$ in a neighborhood of 0.

(b) Show that the polynomial $q(x) = 1 - x - x^2$ can be factored as

\[ q(x) = (1 - \alpha x)(1 - \beta x), \]

where $\alpha$ and $\beta$ are the roots of the polynomial $p(x) = x^2 - x - 1$.

(c) Expand the expression for $F$ in partial fractions and obtain

\[ F(x) = \frac{x}{1 - x - x^2} = \frac{x}{(1 - \alpha x)(1 - \beta x)} = \frac{A}{1 - \alpha x} + \frac{B}{1 - \beta x}, \]

where $A = 1/(\alpha - \beta)$ and $B = 1/(\beta - \alpha)$.

(d) Conclude that $F_n = A\alpha^n + B\beta^n$ for $n \geq 0$. The two roots of $p$ are actually

\[ \alpha = \frac{1 + \sqrt{5}}{2} \quad \text{and} \quad \beta = \frac{1 - \sqrt{5}}{2}, \]

so that $A = 1/\sqrt{5}$ and $B = -1/\sqrt{5}$.

The number $1/\alpha = (\sqrt{5} - 1)/2$, which is known as the golden mean, satisfies
the following property: given a line segment $[AC]$ of unit length (Figure 2), there
exists a unique point $B$ on this segment so that the following proportion holds

\[ \frac{AC}{AB} = \frac{AB}{BC}. \]

If $\ell = AB$, this reduces to the equation $\ell^2 + \ell - 1 = 0$, whose only positive
solution is the golden mean. This ratio arises also in the construction of the regular
pentagon. It has played a role in architecture and art, going back to the time of
ancient Greece.


Figure 2. Appearance of the golden mean (a segment with marked points A, B, C)

3. More generally, consider the difference equation given by the initial values $u_0$
and $u_1$, and the recurrence relation $u_n = a u_{n-1} + b u_{n-2}$ for $n \geq 2$. Define the
generating function associated to $\{u_n\}_{n=0}^\infty$ by $U(x) = \sum_{n=0}^\infty u_n x^n$. The recurrence
relation implies that $U(x)(1 - ax - bx^2) = u_0 + (u_1 - a u_0)x$ in a neighborhood of
the origin. If $\alpha$ and $\beta$ denote the roots of the polynomial $p(x) = x^2 - ax - b$, then
we may write

\[ U(x) = \frac{u_0 + (u_1 - a u_0)x}{(1 - \alpha x)(1 - \beta x)} = \frac{A}{1 - \alpha x} + \frac{B}{1 - \beta x} = A \sum_{n=0}^\infty \alpha^n x^n + B \sum_{n=0}^\infty \beta^n x^n, \]

where it is an easy matter to solve for $A$ and $B$. Finally, this gives $u_n = A\alpha^n + B\beta^n$.
Note that this approach yields a solution to our problem if the roots of $p$
are distinct, namely $\alpha \neq \beta$. A variant of the formula holds if $\alpha = \beta$.


4. Using the generating formula for $p(n)$, prove the recurrence formula

\[ \begin{aligned} p(n) &= p(n-1) + p(n-2) - p(n-5) - p(n-7) - \cdots \\ &= \sum_{k \neq 0} (-1)^{k+1} p\left(n - \frac{k(3k+1)}{2}\right), \end{aligned} \]

where the right-hand side is the finite sum taken over those $k \in \mathbb{Z}$, $k \neq 0$, with
$k(3k+1)/2 \leq n$. Use this formula to calculate $p(5)$, $p(6)$, $p(7)$, $p(8)$, $p(9)$, and
$p(10)$; check that $p(10) = 42$.


The next two exercises give elementary results related to the asymptotics of the 
partition function. More refined statements can be found in Appendix A. 


5. Let

\[ F(x) = \sum_{n=0}^\infty p(n) x^n \]

be the generating function for the partitions. Show that

\[ \log F(x) \sim \frac{\pi^2}{6(1 - x)} \quad \text{as } x \to 1, \text{ with } 0 < x < 1. \]

[Hint: Use $\log F(x) = \sum \log(1/(1 - x^n))$ and $\log(1/(1 - x^n)) = \sum (1/m) x^{nm}$, so

\[ \log F(x) = \sum \frac{1}{m} \frac{x^m}{1 - x^m}. \]

Use also $m x^{m-1}(1 - x) < 1 - x^m < m(1 - x)$.]

6. Show as a consequence of Exercise 5 that

\[ e^{c_1 n^{1/2}} \leq p(n) \leq e^{c_2 n^{1/2}} \]

for two positive constants $c_1$ and $c_2$.

[Hint: $F(e^{-y}) = \sum p(n) e^{-ny} \leq C e^{c/y}$ as $y \to 0$. So $p(n) e^{-ny} \leq C e^{c/y}$. Take
$y = 1/n^{1/2}$ to get $p(n) \leq c' e^{c' n^{1/2}}$. In the opposite direction

\[ \sum_{n=0}^{m} p(n) e^{-ny} \geq C \left( e^{c/y} - \sum_{n=m+1}^\infty e^{c n^{1/2}} e^{-ny} \right), \]

and it suffices to take $y = A m^{-1/2}$ where $A$ is a large constant, and use the fact
that the sequence $p(n)$ is increasing.]

7. Use the product formula for $\Theta$ to prove:

(a) The "triangular number" identity

\[ \prod_{n=0}^\infty (1 + x^n)(1 - x^{2n+2}) = \sum_{n=-\infty}^\infty x^{n(n+1)/2}, \]

which holds for $|x| < 1$.

(b) The "septagonal number" identity

\[ \prod_{n=0}^\infty (1 - x^{5n+1})(1 - x^{5n+4})(1 - x^{5n+5}) = \sum_{n=-\infty}^\infty (-1)^n x^{n(5n+3)/2}, \]

which holds for $|x| < 1$.

8. Consider Pythagorean triples $(a, b, c)$ with $a^2 + b^2 = c^2$, and with $a, b, c \in \mathbb{Z}$.
Suppose moreover that $a$ and $b$ have no common factors.

(a) Show that either $a$ or $b$ must be odd, and the other even.

(b) Show in this case (assuming $a$ is odd and $b$ even) that there are integers
$m, n$ so that $a = m^2 - n^2$, $b = 2mn$, and $c = m^2 + n^2$. [Hint: Note that
$b^2 = (c - a)(c + a)$, and prove that $(c - a)/2$ and $(c + a)/2$ are relatively
prime integers.]

(c) Conversely, show that whenever $c$ is a sum of two squares, then there exist
integers $a$ and $b$ such that $a^2 + b^2 = c^2$.


9. Use the formula for $r_2(n)$ to prove the following:

(a) If $n = p$, where $p$ is a prime of the form $4k + 1$, then $r_2(n) = 8$. This implies
that $n$ can be written in a unique way as $n = n_1^2 + n_2^2$, except for the signs
and reordering of $n_1$ and $n_2$.

(b) If $n = q^a$, where $q$ is prime of the form $4k + 3$ and $a$ is a positive integer,
then $r_2(n) > 0$ if and only if $a$ is even.

(c) In general, $n$ can be represented as the sum of two squares if and only if all
the primes of the form $4k + 3$ that arise in the prime decomposition of $n$
occur with even exponents.


10. Observe the following irregularities of the functions $r_2(n)$ and $r_4(n)$ as $n$
becomes large:

(a) $r_2(n) = 0$ for infinitely many $n$, while $\limsup_{n \to \infty} r_2(n) = \infty$.

(b) $r_4(n) = 24$ for infinitely many $n$ while $\limsup_{n \to \infty} r_4(n)/n = \infty$.

[Hint: For (a) consider $n = 5^k$; for (b) consider alternatively $n = 2^k$, and $n = q^k$
where $q$ is odd and large.]


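11. Recall from Problem 2 in Chapter 2, that

\[ \sum_{n=1}^\infty d(n) z^n = \sum_{n=1}^\infty \frac{z^n}{1 - z^n}, \quad |z| < 1, \]

where $d(n)$ denotes the number of divisors of $n$.

More generally, show that

\[ \sum_{n=1}^\infty \sigma_\ell(n) z^n = \sum_{n=1}^\infty \frac{n^\ell z^n}{1 - z^n}, \quad |z| < 1, \]

where $\sigma_\ell(n)$ is the sum of the $\ell$th powers of divisors of $n$.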


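12. Here we give another identity involving $\theta^4$, which is equivalent to the
four-squares theorem.

(a) Show that for $|q| < 1$

\[ \sum_{n=1}^\infty \frac{n q^n}{1 - q^n} = \sum_{n=1}^\infty \frac{q^n}{(1 - q^n)^2}. \]

[Hint: The left-hand side is $\sum \sigma_1(n) q^n$. Use $x/(1 - x)^2 = \sum_{n=1}^\infty n x^n$.]

(b) Show as a result that

\[ \sum_{n=1}^\infty \frac{n q^n}{1 - q^n} - \sum_{n=1}^\infty \frac{4n q^{4n}}{1 - q^{4n}} = \sum_{n=1}^\infty \frac{q^n}{(1 - q^n)^2} - \sum_{n=1}^\infty \frac{4 q^{4n}}{(1 - q^{4n})^2} = \sum_{n=1}^\infty \sigma_1^*(n) q^n, \]

where $\sigma_1^*(n)$ is the sum of the divisors of $n$ that are not divisible by 4.

(c) Show that the four-squares theorem is equivalent to the identity

\[ \theta(\tau)^4 = 1 + 8 \sum_{n=1}^\infty \frac{q^n}{(1 + (-1)^n q^n)^2}, \quad q = e^{\pi i \tau}. \]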
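Two of the exercises above state formulas that can be sanity-checked by machine before attempting a proof: the closed form $F_n = A\alpha^n + B\beta^n$ of Exercise 2(d) and the recurrence for $p(n)$ in Exercise 4. The following sketch is only a numerical illustration, not a solution; the function names `binet` and `partition_numbers` are chosen here and do not appear in the text.

```python
from math import sqrt


def binet(n: int) -> float:
    """Closed form A*alpha^n + B*beta^n with A = 1/sqrt(5), B = -1/sqrt(5)."""
    alpha = (1 + sqrt(5)) / 2
    beta = (1 - sqrt(5)) / 2
    return (alpha ** n - beta ** n) / sqrt(5)


def fibonacci(count: int) -> list:
    """First `count` Fibonacci numbers from F_0 = 0, F_1 = 1 and the recursion."""
    values = [0, 1]
    while len(values) < count:
        values.append(values[-1] + values[-2])
    return values[:count]


def partition_numbers(limit: int) -> list:
    """p(0), ..., p(limit) via p(n) = sum over k != 0 of (-1)^(k+1) p(n - k(3k+1)/2)."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = 0
        j = 1
        # For k = -j the offset is j(3j-1)/2, the smaller of the two offsets.
        while j * (3 * j - 1) // 2 <= n:
            sign = 1 if j % 2 == 1 else -1  # (-1)^(k+1) is the same for k = j and k = -j
            for k in (-j, j):
                offset = k * (3 * k + 1) // 2
                if offset <= n:
                    total += sign * p[n - offset]
            j += 1
        p[n] = total
    return p


if __name__ == "__main__":
    print([round(binet(n)) for n in range(10)], fibonacci(10))
    print(partition_numbers(10))
```

Floating-point arithmetic limits the first check to moderate $n$; the partition recurrence uses exact integer arithmetic throughout.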
5 Problems

1.* Suppose $n$ is of the form $n = 4^a(8k + 7)$, where $a$ and $k$ are positive integers.
Show that $n$ cannot be written as the sum of three squares. The converse, that
every $n$ that is not of that form can be written as the sum of three squares, is a
difficult theorem of Legendre and Gauss.

2. Let $SL_2(\mathbb{Z})$ denote the set of $2 \times 2$ matrices with integer entries and determinant
1, that is,

\[ SL_2(\mathbb{Z}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z} \text{ and } ad - bc = 1 \right\}. \]

This group acts on the upper half-plane by the fractional linear transformation
$g(\tau) = (a\tau + b)/(c\tau + d)$. Together with this action comes the so-called
fundamental domain $\mathcal{F}_1$ in the complex plane defined by

\[ \mathcal{F}_1 = \{\tau \in \mathbb{C} : |\tau| \geq 1, \ |\mathrm{Re}(\tau)| \leq 1/2 \text{ and } \mathrm{Im}(\tau) \geq 0\}. \]

It is illustrated in Figure 3.

Figure 3. The fundamental domain $\mathcal{F}_1$

Consider the two elements in $SL_2(\mathbb{Z})$ defined by $S(\tau) = -1/\tau$ and $T_1(\tau) = \tau + 1$.
These correspond (for example) to the matrices

\[ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \]

respectively. Let $\mathfrak{g}$ be the subgroup of $SL_2(\mathbb{Z})$ generated by $S$ and $T_1$.

(a) Show that for every $\tau \in \mathbb{H}$ there exists $g \in \mathfrak{g}$ such that $g(\tau) \in \mathcal{F}_1$.

(b) We say that two points $\tau$ and $\tau'$ are congruent if there exists $g \in SL_2(\mathbb{Z})$ such
that $g(\tau) = \tau'$. Prove that if $\tau, \tau' \in \mathcal{F}_1$ are congruent, then either $\mathrm{Re}(\tau) = \pm 1/2$
and $\tau' = \tau \mp 1$ or $|\tau| = 1$ and $\tau' = -1/\tau$. [Hint: Say $\tau' = g(\tau)$. Why
can one assume that $\mathrm{Im}(\tau') \geq \mathrm{Im}(\tau)$, and therefore $|c\tau + d| \leq 1$? Now
consider separately the possibilities $c = -1$, $c = 0$, or $c = 1$.]

(c) Prove that $S$ and $T_1$ generate the modular group in the sense that every
fractional linear transformation corresponding to $g \in SL_2(\mathbb{Z})$ is a
composition of finitely many $S$'s and $T_1$'s, and their inverses. Strictly speaking, the
matrices associated to $S$ and $T_1$ generate the projective special linear group
$PSL_2(\mathbb{Z})$, which equals $SL_2(\mathbb{Z})$ modulo $\pm I$. [Hint: Observe that $2i$ is in the
interior of $\mathcal{F}_1$. Now map $g(2i)$ back into $\mathcal{F}_1$ by using part (a). Use part (b)
to conclude.]


3. In this problem, consider the group $G$ of matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with integer
entries, determinant 1, and such that $a$ and $d$ have the same parity, $b$ and $c$ have
the same parity, and $c$ and $d$ have opposite parity. This group also acts on the
upper half-plane by fractional linear transformations. To the group $G$ corresponds
the fundamental domain $\mathcal{F}$ defined by $|\tau| \geq 1$, $|\mathrm{Re}(\tau)| \leq 1$, and $\mathrm{Im}(\tau) \geq 0$ (see
Figure 1). Also, let

\[ S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad T_2 = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}. \]

Prove that every fractional linear transformation corresponding to $g \in G$ is a
composition of finitely many $S$, $T_2$ and their inverses, in analogy with the previous
problem.

4. Let $G$ denote the group of matrices given in the previous problem. Here we
give an alternate proof of Theorem 3.4, that states that a function in $\mathbb{H}$ which is
holomorphic, bounded, and invariant under $G$ must be constant.

(a) Suppose that $f : \mathbb{H} \to \mathbb{C}$ is holomorphic, bounded, and that there exists a
sequence of complex numbers $\tau_k = x_k + i y_k$ such that

\[ f(\tau_k) = 0, \quad \sum_{k=1}^\infty y_k = \infty, \quad 0 < y_k \leq 1, \quad \text{and} \quad |x_k| \leq 1. \]

Then $f = 0$. [Hint: When $x_k = 0$ see Problem 5 in Chapter 8.]

(b) Given two relatively prime integers $c$ and $d$ with different parity, show that
there exist integers $a$ and $b$ such that $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in G$. [Hint: All
solutions of $xc + dy = 1$ take the form $x_0 + dt$ and $y_0 - ct$ where $x_0, y_0$ is a
particular solution and $t \in \mathbb{Z}$.]

(c) Prove that $\sum 1/(c^2 + d^2) = \infty$ where the sum is taken over all $c$ and $d$
that are relatively prime and of opposite parity. [Hint: Suppose not, and
prove that $\sum_{(a,b)=1} 1/(a^2 + b^2) < \infty$ where the sum is over all relatively
prime integers $a$ and $b$. To do so, note that if $a$ and $b$ are both odd and
relatively prime, then the two numbers $c$ and $d$ defined by $c = (a + b)/2$
and $d = (a - b)/2$ are relatively prime and of opposite parity. Moreover,
$c^2 + d^2 \leq A(a^2 + b^2)$ for some universal constant $A$. Therefore

\[ \sum_{n \neq 0} \frac{1}{n^2} \sum_{(a,b)=1} \frac{1}{a^2 + b^2} < \infty, \]

hence $\sum 1/(k^2 + \ell^2) < \infty$, where the sum is over all integers $k$ and $\ell$ such
that $(k, \ell) \neq (0, 0)$. Why is this a contradiction?]

(d) Prove that if $F : \mathbb{H} \to \mathbb{C}$ is holomorphic, bounded, and invariant under
$G$, then $F$ is constant. [Hint: Replace $F(\tau)$ by $F(\tau) - F(i)$ so that we
can assume $F(i) = 0$ and prove $F = 0$. For each relatively prime $c$ and
$d$ with opposite parity, choose $g \in G$ so that $g(i) = x_{c,d} + i/(c^2 + d^2)$ with
$|x_{c,d}| \leq 1$.]


5.* In Chapter 9 we proved that the Weierstrass $\wp$ function satisfies the cubic
equation

\[ (\wp')^2 = 4\wp^3 - g_2 \wp - g_3, \]

where $g_2 = 60E_4$, $g_3 = 140E_6$, and $E_k$ is the Eisenstein series of order $k$. The
discriminant of the cubic $y^2 = 4x^3 - g_2 x - g_3$ is defined by $\Delta = g_2^3 - 27g_3^2$. Prove
that

\[ \Delta(\tau) = (2\pi)^{12} \eta^{24}(\tau) \quad \text{for all } \tau \in \mathbb{H}. \]

[Hint: $\Delta$ and $\eta^{24}$ satisfy the same transformation laws under $\tau \mapsto \tau + 1$ and
$\tau \mapsto -1/\tau$. Because of the fundamental domain described in Problem 2, it suffices then
to investigate the behavior at the only cusp, which is at infinity.]


6.* Here we will deduce the formula for $r_8(n)$, which counts the number of
representations of $n$ as a sum of eight squares. The method is parallel to that of $r_4(n)$,
but the details are less delicate.

Theorem: $r_8(n) = 16\sigma_3^*(n)$.

Here $\sigma_3^*(n) = \sigma_3(n) = \sum_{d|n} d^3$, when $n$ is odd. Also, when $n$ is even

\[ \sigma_3^*(n) = \sum_{d|n} (-1)^d d^3 = \sigma_3^e(n) - \sigma_3^o(n), \]

where $\sigma_3^e(n) = \sum_{d|n,\ d \text{ even}} d^3$ and $\sigma_3^o(n) = \sum_{d|n,\ d \text{ odd}} d^3$.
Consider the appropriate Eisenstein series

\[ E_4^*(\tau) = \sum \frac{1}{(n + m\tau)^4}, \]

where the sum is over integers $n$ and $m$ with opposite parity. Recall the standard
Eisenstein series

\[ E_4(\tau) = \sum_{(n,m) \neq (0,0)} \frac{1}{(n + m\tau)^4}. \]

Notice that the series defining $E_4^*$ is absolutely convergent, in distinction to $E_2^*(\tau)$,
which arose when considering $r_4(n)$. This makes some of the considerations below
quite a bit simpler.

(a) Prove that $r_8(n) = 16\sigma_3^*(n)$ is equivalent to the identity $\theta(\tau)^8 = 48\pi^{-4} E_4^*(\tau)$.
[Hint: Use the fact that $E_4(\tau) = 2\zeta(4) + \frac{(2\pi)^4}{3} \sum_{k=1}^\infty \sigma_3(k) e^{2\pi i k \tau}$ and $\zeta(4) = \pi^4/90$.]

(b) Note that $E_4^*(\tau) = E_4(\tau) - 2^{-4} E_4((\tau - 1)/2)$.

(c) $E_4^*(\tau + 2) = E_4^*(\tau)$.

(d) $E_4^*(\tau) = \tau^{-4} E_4^*(-1/\tau)$.

(e) $(48/\pi^4) E_4^*(\tau) \to 1$ as $\mathrm{Im}(\tau) \to \infty$.

(f) $|E_4^*(1 - 1/\tau)| \approx |\tau|^4 |e^{2\pi i \tau}|$, as $\mathrm{Im}(\tau) \to \infty$. [Hint: Verify that
$E_4^*(1 - 1/\tau) = \tau^4 (E_4(\tau) - E_4(2\tau))$.]

Since $\theta(\tau)^8$ satisfies properties similar to (c), (d), (e) and (f) above, it follows that
the invariant function $48\pi^{-4} E_4^*(\tau)/\theta(\tau)^8$ is bounded and hence a constant, which
must be 1. This gives the desired result.
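Two of the problems above have a computational side that can be explored directly. The sketch below is an illustration under stated assumptions, not a solution. The first function carries out the reduction behind Problem 2(a) in floating-point arithmetic, using only $T_1^{\pm 1}$ and $S$; the tolerance and step limit are choices made here. The second part compares the coefficients of $\theta^8 = (\sum_n q^{n^2})^8$ with $16\sigma_3^*(n)$ from Problem 6. All function names are hypothetical.

```python
def reduce_to_fundamental_domain(tau: complex, max_steps: int = 10_000):
    """Move tau (Im tau > 0) into |tau| >= 1, |Re tau| <= 1/2 using T_1 and S.

    Returns the reduced point and the list of moves applied, in order.
    """
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    moves = []
    for _ in range(max_steps):
        shift = round(tau.real)
        if shift != 0:
            tau = tau - shift              # apply T_1 a total of -shift times
            moves.append(("T1", -shift))
        if abs(tau) >= 1 - 1e-12:          # small tolerance for rounding error
            return tau, moves
        tau = -1 / tau                     # apply S; this increases Im(tau)
        moves.append(("S", 1))
    raise RuntimeError("no convergence within max_steps")


def theta_power_coefficients(power: int, limit: int) -> list:
    """Coefficients of (sum over n in Z of q^(n^2))^power up to q^limit."""
    theta = [0] * (limit + 1)
    theta[0] = 1
    n = 1
    while n * n <= limit:
        theta[n * n] = 2                   # n and -n give the same exponent
        n += 1
    result = [1] + [0] * limit
    for _ in range(power):
        result = [
            sum(result[i] * theta[k - i] for i in range(k + 1))
            for k in range(limit + 1)
        ]
    return result


def sigma3_star(n: int) -> int:
    """sigma_3(n) for odd n; for even n, the sum over d | n of (-1)^d d^3."""
    return sum((-1) ** (n + d) * d ** 3 for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    print(reduce_to_fundamental_domain(complex(0.37, 0.02)))
    r8 = theta_power_coefficients(8, 12)
    print([(n, r8[n], 16 * sigma3_star(n)) for n in range(1, 13)])
```

The reduction terminates because each application of $S$ to a point with $|\tau| < 1$ strictly increases the imaginary part, which is the idea suggested for part (a).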
On the numerical computation of the definite integral
$\int_w \cos \frac{\pi}{2}(w^3 - m.w)$, between the limits 0 and $\frac{1}{0}$.

The simplicity of the form of this differential
coefficient induces me to suppose that the integral
may possibly be expressible by some of the integrals
whose values have been tabulated. After many
attempts however, I have not succeeded in reducing it
to any known integral: and I have therefore computed
its value by actual summation to a considerable extent
and by series for the remainder.

G. B. Airy, 1838


In a number of problems in analysis the solution is given by a function
whose explicit calculation is not tractable. Often a useful substitute (and
the only recourse) is to study the asymptotic behavior of this function
near the point of interest. Here we shall investigate several related types
of asymptotics, where the ideas of complex analysis are of crucial help.
These typically center about the behavior for large values of the variable
$s$ of an integral of the form

(1) \[ I(s) = \int_a^b e^{-s\Phi(x)}\, dx. \]

We organize our presentation by formulating three guiding principles. 

(i) Deformation of contour. The function $\Phi$ is in general
complex-valued; therefore, for large $s$ the integrand in (1) may oscillate
rapidly, so that the resulting cancellations mask the true behavior
of $I(s)$. When $\Phi$ is holomorphic (which is often the case) one can
hope to change the contour of integration so that as far as possible,
on the new contour $\Phi$ is essentially real-valued. If this is possible,
one can then hope to read off the behavior of $I(s)$ in a rather direct
manner. This idea will be illustrated first in the context of Bessel
functions.

(ii) Laplace's method. In the case when $\Phi$ is real-valued on the
contour and $s$ is positive, the maximum contribution to $I(s)$ comes
from the integration near a minimum of $\Phi$, and this leads to a
satisfactory expansion in terms of the quadratic behavior of $\Phi$ near
its minimum. We apply these ideas to present the asymptotics of
the gamma function (Stirling's formula), and also those of the Airy
function.

(iii) Generating functions. If $\{F_n\}$ is a number-theoretic or
combinatorial sequence, we have already seen in several examples that one
can exploit analytic properties of the generating function, $F(u) = \sum F_n u^n$,
to obtain interesting conclusions regarding $\{F_n\}$. In fact
the asymptotic behavior of $F_n$, as $n \to \infty$, can also be analyzed
this way, via the formula

\[ F_n = \int_\gamma F(e^{2\pi i z}) e^{-2\pi i n z}\, dz. \]

Here $\gamma$ is an appropriate segment of unit length in the upper
half-plane. This formula can then be studied as a variant of the
integral (1). We shall show how these ideas apply in an important
particular case to obtain an asymptotic formula for $p(n)$, the
number of partitions of $n$.

1 Bessel functions 

Bessel functions appear naturally in many problems that exhibit
rotational symmetries. For instance, the Fourier transform of a spherical
function in $\mathbb{R}^d$ is neatly expressed in terms of a Bessel function of order
$(d/2) - 1$. See Chapter 6 in Book I.

The Bessel functions can be defined by a number of alternative
formulas. We take the one that is valid for all order $\nu > -1/2$, given by

(2) \[ J_\nu(s) = \frac{(s/2)^\nu}{\Gamma(\nu + 1/2)\Gamma(1/2)} \int_{-1}^{1} e^{isx}(1 - x^2)^{\nu - 1/2}\, dx. \]

If we also write $J_{-1/2}(s)$ for $\lim_{\nu \to -1/2} J_\nu(s)$, we see that it equals
$\sqrt{\frac{2}{\pi s}} \cos s$; observe in addition that $J_{1/2}(s) = \sqrt{\frac{2}{\pi s}} \sin s$. However, $J_\nu(s)$
has an expression in terms of elementary functions only when $\nu$ is
half-integral, and understanding this function in general requires further
analysis. Its behavior for large $s$ is suggested by the two examples above.

Theorem 1.1 $J_\nu(s) = \sqrt{\frac{2}{\pi s}} \cos\left(s - \frac{\pi\nu}{2} - \frac{\pi}{4}\right) + O(s^{-3/2})$ as $s \to \infty$.

In view of the formula for $J_\nu(s)$ it suffices to investigate

(3) \[ I(s) = \int_{-1}^{1} e^{isx}(1 - x^2)^{\nu - 1/2}\, dx, \]

and to this end we consider the analytic function $f(z) = e^{isz}(1 - z^2)^{\nu - 1/2}$
in the complex plane slit along the rays $(-\infty, -1) \cup (1, \infty)$; for
$(1 - z^2)^{\nu - 1/2}$ we choose that branch that is positive when $z = x \in (-1, 1)$.
With $s > 0$ fixed, we apply Cauchy's theorem to see that

\[ I(s) = -I_-(s) - I_+(s), \]

where the integrals $I_-(s)$ and $I_+(s)$ are taken over the lines shown
in Figure 1. This is established by using the fact that $\int_{\gamma_{\epsilon,R}} f(z)\, dz = 0$
where $\gamma_{\epsilon,R}$ is the second contour of Figure 1, and letting $\epsilon \to 0$ and
$R \to \infty$.

Figure 1. Contours of integration of $I(s)$, $I_-(s)$, $I_+(s)$, and the contour $\gamma_{\epsilon,R}$

On the contour for $I_+(s)$ we have $z = 1 + iy$, so

(4) \[ I_+(s) = i e^{is} \int_0^\infty e^{-sy}(1 - (1 + iy)^2)^{\nu - 1/2}\, dy. \]

There is a similar expression for $I_-(s)$.

What has the passage from $I(s)$ to $-(I_-(s) + I_+(s))$ gained us? Observe
that for large positive $s$, the exponential $e^{isx}$ in (3) oscillates
rapidly, so the estimation of that integral is not obvious at first glance.
However, in (4) the corresponding exponential is $e^{-sy}$, and it decreases
rapidly as $s \to \infty$, except when $y = 0$. Thus in this case one sees
immediately that the main contribution to the integral comes from the
integration near $y = 0$, and this allows one readily to approximate this
integral. This idea is made precise in the following observation.

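Proposition 1.2 Suppose $a$ and $m$ are fixed, with $a > 0$ and $m > -1$.
Then as $s \to \infty$

(5) \[ \int_0^a e^{-sx} x^m\, dx = s^{-m-1}\Gamma(m + 1) + O(e^{-cs}), \]

for some positive $c$.

Proof. The fact that $m > -1$ guarantees that $\int_0^a e^{-sx} x^m\, dx = \lim_{\epsilon \to 0} \int_\epsilon^a e^{-sx} x^m\, dx$
exists. Then, we write

\[ \int_0^a e^{-sx} x^m\, dx = \int_0^\infty e^{-sx} x^m\, dx - \int_a^\infty e^{-sx} x^m\, dx. \]

The first integral on the right-hand side can be seen to equal
$s^{-m-1}\Gamma(m + 1)$, if we make the change of variables $x \mapsto x/s$. For the
second integral we note that

(6) \[ \int_a^\infty e^{-sx} x^m\, dx = e^{-cs} \int_a^\infty e^{-s(x - c)} x^m\, dx = O(e^{-cs}), \]

as long as $c < a$, and so the proposition is proved.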
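Proposition 1.2 can be observed numerically: truncating the integral at $a$ changes it only by an exponentially small amount. The sketch below assumes a composite Simpson rule and an integer $m \geq 0$, so that the integrand is smooth on $[0, a]$; the helper names `simpson`, `truncated_integral`, and `leading_term` are introduced here for illustration.

```python
from math import exp, gamma


def simpson(f, a: float, b: float, n: int = 20000) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def truncated_integral(s: float, m: int, a: float) -> float:
    """Numerical value of the integral of e^(-s x) x^m over [0, a]."""
    return simpson(lambda x: exp(-s * x) * x ** m, 0.0, a)


def leading_term(s: float, m: int) -> float:
    """The main term s^(-m-1) Gamma(m+1) in formula (5)."""
    return s ** (-m - 1) * gamma(m + 1)


if __name__ == "__main__":
    for s in (5.0, 10.0, 20.0, 30.0):
        gap = abs(truncated_integral(s, 2, 1.0) - leading_term(s, 2))
        print(s, gap, exp(-0.5 * s))  # gap compared with e^(-cs) for c = a/2
```

For non-integer $m$ near $-1$ the integrand is singular at $0$ and a different quadrature would be needed; that case is not covered by this sketch.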

We return to the integral (4) and observe that

\[ (1 - (1 + iy)^2)^{\nu - 1/2} = (-2iy)^{\nu - 1/2} + O(y^{\nu + 1/2}), \quad \text{for } 0 \leq y \leq 1, \]

while

\[ (1 - (1 + iy)^2)^{\nu - 1/2} = O(y^{\nu - 1/2} + y^{2\nu - 1}), \quad \text{for } 1 \leq y. \]

So, applying the proposition with $a = 1$ and $m = \nu \pm 1/2$, as well as (6),
gives

\[ I_+(s) = i(-2i)^{\nu - 1/2} e^{is} s^{-\nu - 1/2} \Gamma(\nu + 1/2) + O(s^{-\nu - 3/2}). \]

Similarly,

\[ I_-(s) = -i(2i)^{\nu - 1/2} e^{-is} s^{-\nu - 1/2} \Gamma(\nu + 1/2) + O(s^{-\nu - 3/2}). \]

If we recall that

\[ J_\nu(s) = \frac{(s/2)^\nu}{\Gamma(\nu + 1/2)\Gamma(1/2)} [-I_-(s) - I_+(s)], \]

and the fact that $\Gamma(1/2) = \sqrt{\pi}$, we see that we have obtained the proof
of the theorem.
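The asymptotic formula of Theorem 1.1 can be compared with a direct numerical evaluation of the integral (2). The sketch below is illustrative only and makes two assumptions of its own: the substitution $x = \sin t$, which turns (2) into an integral of $\cos(s \sin t)\cos^{2\nu} t$ over $[-\pi/2, \pi/2]$ (the imaginary part vanishes by symmetry), and a composite Simpson rule, which is adequate when $\nu \geq 0$. The function names are hypothetical.

```python
from math import cos, gamma, pi, sin, sqrt


def simpson(f, a: float, b: float, n: int = 20000) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def bessel_j(nu: float, s: float) -> float:
    """J_nu(s) from formula (2) after the substitution x = sin(t); assumes nu >= 0."""
    integral = simpson(
        lambda t: cos(s * sin(t)) * max(cos(t), 0.0) ** (2 * nu),
        -pi / 2,
        pi / 2,
    )
    return (s / 2) ** nu / (gamma(nu + 0.5) * gamma(0.5)) * integral


def bessel_asymptotic(nu: float, s: float) -> float:
    """Leading term sqrt(2/(pi s)) cos(s - pi nu/2 - pi/4) of Theorem 1.1."""
    return sqrt(2 / (pi * s)) * cos(s - pi * nu / 2 - pi / 4)


if __name__ == "__main__":
    for s in (10.0, 50.0, 200.0):
        print(s, bessel_j(1.0, s), bessel_asymptotic(1.0, s), s ** -1.5)
```

For $\nu = 1/2$ the two functions agree up to quadrature error, in line with the elementary formula for $J_{1/2}$; for other orders the gap shrinks like $s^{-3/2}$.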

For later purposes it is interesting to point out that under certain
restricted circumstances, the gist of the conclusion in Proposition 1.2
extends to the complex half-plane $\mathrm{Re}(s) \geq 0$.

Proposition 1.3 Suppose $a$ and $m$ are fixed, with $a > 0$ and
$-1 < m < 0$. Then as $|s| \to \infty$ with $\mathrm{Re}(s) \geq 0$,

\[ \int_0^a e^{-sx} x^m\, dx = s^{-m-1}\Gamma(m + 1) + O(1/|s|). \]

(Here $s^{-m-1}$ is the branch of that function that is positive for $s > 0$.)

Proof. We begin by showing that when $\mathrm{Re}(s) \geq 0$, $s \neq 0$,

\[ \int_0^\infty e^{-sx} x^m\, dx = \lim_{N \to \infty} \int_0^N e^{-sx} x^m\, dx \]

exists and equals $s^{-m-1}\Gamma(m + 1)$. If $N$ is large, we first write

\[ \int_0^N e^{-sx} x^m\, dx = \int_0^a e^{-sx} x^m\, dx + \int_a^N e^{-sx} x^m\, dx. \]

Since $m > -1$, the first integral on the right-hand side defines an analytic
function everywhere. For the second integral, we note that $-\frac{1}{s}\frac{d}{dx}(e^{-sx}) = e^{-sx}$,
so an integration by parts gives

(7) \[ \int_a^N e^{-sx} x^m\, dx = \frac{m}{s} \int_a^N e^{-sx} x^{m-1}\, dx - \left[ \frac{e^{-sx}}{s} x^m \right]_a^N. \]

This identity, together with the convergence of the integral $\int_a^\infty x^{m-1}\, dx$,
shows that $\int_a^\infty e^{-sx} x^m\, dx$ defines an analytic function on $\mathrm{Re}(s) > 0$ that
is continuous on $\mathrm{Re}(s) \geq 0$, $s \neq 0$. Thus $\int_0^\infty e^{-sx} x^m\, dx$ is analytic on the
half-plane $\mathrm{Re}(s) > 0$ and continuous on $\mathrm{Re}(s) \geq 0$, $s \neq 0$. Since it equals
$s^{-m-1}\Gamma(m + 1)$ when $s$ is positive, we deduce that $\int_0^\infty e^{-sx} x^m\, dx = s^{-m-1}\Gamma(m + 1)$
when $\mathrm{Re}(s) \geq 0$, $s \neq 0$.

However, we now have

\[ \int_0^a e^{-sx} x^m\, dx = \int_0^\infty e^{-sx} x^m\, dx - \int_a^\infty e^{-sx} x^m\, dx. \]

It is clear from (7), and from the fact that $m < 0$, that if we let $N \to \infty$,
then $\int_a^\infty e^{-sx} x^m\, dx = O(1/|s|)$. The proposition is therefore proved.

Note. If one wants to obtain a better error term in Proposition 1.3, 
or for that matter extend the range of m, then one needs to mitigate the 
effect of the contribution of the end-point x = a. This can be done by 
introducing suitable smooth cut-offs. See Problem 1. 


2 Laplace’s method; Stirling’s formula 

We have already mentioned that when $\Phi$ is real-valued, the main
contribution to $\int_a^b e^{-s\Phi(x)}\, dx$ as $s \to \infty$ comes from the point where $\Phi$ takes its
minimum value. A situation where this minimum is attained at one of
the end-points, $a$ or $b$, was considered in Proposition 1.2. We now turn
to the important case when the minimum is achieved in the interior of
$[a, b]$.

Consider

\[ \int_a^b e^{-s\Phi(x)} \psi(x)\, dx \]

where the phase $\Phi$ is real-valued, and both it and the amplitude $\psi$
are assumed for simplicity to be indefinitely differentiable. Our
hypothesis regarding the minimum of $\Phi$ is that there is an $x_0 \in (a, b)$ so that
$\Phi'(x_0) = 0$, but $\Phi''(x) > 0$ throughout $[a, b]$ (Figure 2 illustrates the
situation).



Proposition 2.1 Under the above assumptions, with $s > 0$ and $s \to \infty$,

(8) \[ \int_a^b e^{-s\Phi(x)} \psi(x)\, dx = e^{-s\Phi(x_0)} \left[ \frac{A}{s^{1/2}} + O\left(\frac{1}{s}\right) \right], \]

where

\[ A = \sqrt{2\pi}\, \frac{\psi(x_0)}{(\Phi''(x_0))^{1/2}}. \]

Figure 2. The function $\Phi$, with its minimum at $x_0$

Proof. By replacing $\Phi(x)$ by $\Phi(x) - \Phi(x_0)$ we may assume that
$\Phi(x_0) = 0$. Since $\Phi'(x_0) = 0$, we note that

\[ \frac{\Phi(x)}{(x - x_0)^2} = \frac{\Phi''(x_0)}{2} \varphi(x), \]

where $\varphi$ is smooth, and $\varphi(x) = 1 + O(x - x_0)$ as $x \to x_0$. We can
therefore make the smooth change of variables $x \mapsto y = (x - x_0)(\varphi(x))^{1/2}$ in
a small neighborhood of $x = x_0$, and observe that $dy/dx|_{x_0} = 1$, and
thus $dx/dy = 1 + O(y)$ as $y \to 0$. Moreover, we have $\psi(x) = \tilde{\psi}(y)$ with
$\tilde{\psi}(y) = \psi(x_0) + O(y)$ as $y \to 0$. Hence if $[a', b']$ is a sufficiently small
interval containing $x_0$ in its interior, by making the indicated change of
variables we obtain

(9) \[ \int_{a'}^{b'} e^{-s\Phi(x)} \psi(x)\, dx = \psi(x_0) \int_\alpha^\beta e^{-s\frac{\Phi''(x_0)}{2} y^2}\, dy + O\left( \int_\alpha^\beta e^{-s\frac{\Phi''(x_0)}{2} y^2} |y|\, dy \right), \]

where $\alpha < 0 < \beta$. We now make the further change of variables $y^2 = X$,
$dy = \frac{1}{2} X^{-1/2}\, dX$, and we see by (5) that the first integral on the
right-hand side in (9) is

\[ \int_0^{a_0} e^{-s\frac{\Phi''(x_0)}{2} X} X^{-1/2}\, dX + O(e^{-\delta s}) = s^{-1/2} \left( \frac{2\pi}{\Phi''(x_0)} \right)^{1/2} + O(e^{-\delta s}), \]

for some $\delta > 0$. By the same argument, the second integral is $O(1/s)$.
What remains are the integrals of $e^{-s\Phi(x)}\psi(x)$ over $[a, a']$ and $[b', b]$; but
these integrals decay exponentially as $s \to \infty$, since $\Phi(x) \geq c > 0$ in these
two sub-intervals. Altogether, this establishes (8) and the proposition.
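The asymptotic relation (8) can be tested on a concrete phase and amplitude. The sketch below is a numerical illustration only, under choices made here: $\Phi(x) = x - 1 - \log x$ and $\psi(x) = 1/x$ on $[1/2, 2]$, so that $x_0 = 1$, $\Phi(x_0) = 0$, $\Phi''(x_0) = 1$, and $A = \sqrt{2\pi}$. The helper names and the Simpson quadrature are assumptions of the example.

```python
from math import exp, log, pi, sqrt


def simpson(f, a: float, b: float, n: int = 20000) -> float:
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def phase(x: float) -> float:
    """Example phase with minimum value 0 at x0 = 1 and second derivative 1 there."""
    return x - 1 - log(x)


def amplitude(x: float) -> float:
    """Example amplitude psi(x) = 1/x, so psi(x0) = 1."""
    return 1 / x


def laplace_integral(s: float) -> float:
    """Numerical value of the integral of e^(-s Phi) psi over [1/2, 2]."""
    return simpson(lambda x: exp(-s * phase(x)) * amplitude(x), 0.5, 2.0)


def laplace_leading_term(s: float) -> float:
    """Main term e^(-s Phi(x0)) A / s^(1/2) of (8) for this example."""
    return sqrt(2 * pi) / sqrt(s)


if __name__ == "__main__":
    for s in (25.0, 100.0, 400.0):
        print(s, laplace_integral(s), laplace_leading_term(s), 1 / s)
```

In this example the difference between the two values stays well below $1/s$, as the error term $O(1/s)$ in (8) predicts.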

It is important to realize that the asymptotic relation (8) extends to 
all complex 5 with Re(5) > 0. The proof, however, requires a somewhat 
different argument: here we must take into account the oscillations of 
e -s$(x) when Is| is large but Re(s) is small, and this is achieved by a 
simple integration by parts. 
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Proposition 2.2 With the same assumptions on $ and ifj，the rela¬ 
tion (8) continues to hold if \s\ — »• oo with Re(s) > 0. 

Proof. We proceed as before to the equation (9), and obtain the 
appropriate asymptotic for the first term, by virtue of Proposition 1.3, 
with m = —1/2. To deal with the rest we start with an observation. If 屯 
and ^ are given on an interval [a, b ], are indefinitely differentiable, and 
> 0, while |^ r/ (x)| > c > 0, then if Re(s) > 0, 

(10) J dx = O as \s\ — > oo. 

Indeed, the integral equals 


dx 


，截 -， 


which by integration by parts gives 


dx 


dx 


D —s^(x) 


矽 ㈤ 

屯’(: C) 


The assertion (10) follows immediately since | e - s ^(^)| < i 7 when 
Re(s) > 0. This allows us to deal with the integrals of e~ s ^^ x ^{x) in 
the complementary intervals [a, a r ] and [6’，6], because in each, |$’(3；)| 之 
c > 0, since $’($o) = 0 and ^ ,r {x) > Ci > 0. 

Finally, for the second term on the right-hand side of (9) we observe 
that it is actually of the form 


rP 


中 " （ xq) 2 

! 2 v m{y) dy, 


where r](y) is differentiable. Then we can again estimate this term by 
integration by parts, once we write it as 



s$"(a: 0 ) 


^> // (a3n) 2\ 

~^~ v ) ri(y) dy, 


obtaining the bound 0(l/|s|). 

The special case of Proposition 2.2 when $s$ is purely imaginary, $s = it$,
$t \to \pm\infty$, is often treated separately; the argument in this situation is
usually referred to as the method of stationary phase. The points $x_0$
for which $\Phi'(x_0) = 0$ are called the critical points.
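As a numerical illustration of Proposition 2.2, the following sketch compares a quadrature value of $\int_{-1}^{1} e^{-sx^2/2}\,dx$ with the leading term $\sqrt{2\pi}/s^{1/2}$ for real, purely imaginary, and general $s$ with $\mathrm{Re}(s) \geq 0$. The choices $\Phi(x) = x^2/2$, $\psi = 1$, the interval $[-1, 1]$, and the helper names `laplace_integral` and `leading_term` are assumptions made only for this example; they are not taken from the text.

```python
import cmath
import math

def laplace_integral(s, n_panels=20000):
    """Composite Simpson value of the integral of exp(-s*x^2/2) over [-1, 1].

    Illustrative choice: Phi(x) = x^2/2, psi(x) = 1, critical point x0 = 0.
    n_panels must be even.
    """
    a, b = -1.0, 1.0
    h = (b - a) / n_panels
    total = cmath.exp(-s * a * a / 2) + cmath.exp(-s * b * b / 2)
    for k in range(1, n_panels):
        x = a + k * h
        weight = 4 if k % 2 else 2  # Simpson weights 4, 2, 4, ...
        total += weight * cmath.exp(-s * x * x / 2)
    return total * h / 3

def leading_term(s):
    """A / s^{1/2} with A = sqrt(2*pi) * psi(x0) / Phi''(x0)^{1/2} = sqrt(2*pi)."""
    return math.sqrt(2 * math.pi) / cmath.sqrt(s)

# The scaled error |integral - leading term| * |s| should stay bounded,
# in line with the O(1/|s|) remainder in (8).
for s in (200.0, 200j, 5 + 200j):
    err = abs(laplace_integral(s) - leading_term(s))
    print(s, err, err * abs(s))
```

For real $s$ the remainder is exponentially small, while for $s = it$ the endpoints contribute terms of size comparable to $1/|s|$, which is the behavior the integration by parts in (10) accounts for.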

Our first application will be to the asymptotic behavior of the gamma
function $\Gamma$, given by Stirling's formula. This formula will be valid in any
sector of the complex plane that omits the negative real axis. For any
$\delta > 0$ we set $S_\delta = \{s : |\arg s| \leq \pi - \delta\}$, and denote by $\log s$ the principal
branch of the logarithm that is given in the plane slit along the negative
real axis.


Theorem 2.3 If $|s| \to \infty$ with $s \in S_\delta$, then

(11)    $\Gamma(s) = e^{s\log s} e^{-s} \frac{\sqrt{2\pi}}{s^{1/2}} \left(1 + O\left(\frac{1}{|s|^{1/2}}\right)\right).$


Remark. With a little extra effort one can improve the error term to
$O(1/|s|)$, and in fact obtain a complete asymptotic expansion in powers
of $1/s$; see Problem 2. Also, we note that (11) implies $\Gamma(s) \sim
\sqrt{2\pi}\, s^{s-1/2} e^{-s}$, which is how Stirling's formula is often stated.
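A quick numerical check of (11) for real $s$ can be sketched as follows. The function names `stirling_log` and `stirling_ratio` are illustrative, and the computation goes through logarithms (an implementation choice, not part of the text) so that large $s$ does not overflow.

```python
import math

def stirling_log(s):
    """Logarithm of the main term e^{s log s} e^{-s} sqrt(2*pi) / s^{1/2}, real s > 0."""
    return s * math.log(s) - s + 0.5 * math.log(2 * math.pi) - 0.5 * math.log(s)

def stirling_ratio(s):
    """Gamma(s) divided by the main term of (11), computed via log-gamma."""
    return math.exp(math.lgamma(s) - stirling_log(s))

# The ratio tends to 1; (ratio - 1) * s settles near 1/12, consistent with the
# improved error term O(1/|s|) and the constant a_1 = 1/12 of Problem 2.
for s in (10.0, 100.0, 1000.0):
    r = stirling_ratio(s)
    print(s, r, (r - 1) * s)
```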


To prove the theorem we first establish (11) in the right half-plane. We
shall show that the formula holds whenever $\mathrm{Re}(s) \geq 0$, and in addition
that the error term is uniform on the closed half-plane, once we omit a
neighborhood of the origin (say $|s| < 1$). To see this, start with $s > 0$,
and write

$\Gamma(s) = \int_0^\infty e^{-x + s\log x}\, \frac{dx}{x}.$

Upon making the change of variables $x \mapsto sx$, the above equals

$\int_0^\infty e^{-sx + s\log sx}\, \frac{dx}{x} = e^{s\log s} e^{-s} \int_0^\infty e^{-s\Phi(x)}\, \frac{dx}{x},$

where $\Phi(x) = x - 1 - \log x$. By analytic continuation this identity continues
to hold, and we have when $\mathrm{Re}(s) > 0$,

$\Gamma(s) = e^{s\log s} e^{-s} I(s)$

with

$I(s) = \int_0^\infty e^{-s\Phi(x)}\, \frac{dx}{x}.$

It now suffices to see that

(12)    $I(s) = \frac{\sqrt{2\pi}}{s^{1/2}} + O\left(\frac{1}{|s|}\right)$ as $|s| \to \infty$.

Observe first that $\Phi(1) = \Phi'(1) = 0$, $\Phi''(x) = 1/x^2 > 0$ whenever
$0 < x < \infty$, and $\Phi''(1) = 1$. Thus $\Phi$ is convex, attains its minimum at
$x = 1$, and is positive.

We apply the complex version of the Laplace method, Proposition 2.2,
in this situation. Here the critical point is $x_0 = 1$ and $\psi(x) = 1/x$.
We choose for convenience the interval $[a, b]$ to be $[1/2, 2]$. Then for
$\int_a^b e^{-s\Phi(x)} \psi(x)\,dx$ we get the asymptotic (12). It remains to bound the
error terms, those corresponding to integration over $[0, 1/2]$, and $[2, \infty)$.
Here the device of integration by parts, which has served us so well, can
be applied again. Indeed, since $\Phi'(x) = 1 - 1/x$, we have

$\int_\epsilon^{1/2} e^{-s\Phi(x)}\, \frac{dx}{x} = -\frac{1}{s}\left[\frac{e^{-s\Phi(x)}}{x - 1}\right]_\epsilon^{1/2} - \frac{1}{s}\int_\epsilon^{1/2} e^{-s\Phi(x)}\, \frac{dx}{(x-1)^2}.$

Noting that $\Phi(\epsilon) \to +\infty$ as $\epsilon \to 0$, and $|e^{-s\Phi(x)}| \leq 1$, we find in the limit
that

$\int_0^{1/2} e^{-s\Phi(x)}\, \frac{dx}{x} = \frac{2}{s}\, e^{-s\Phi(1/2)} - \frac{1}{s}\int_0^{1/2} e^{-s\Phi(x)}\, \frac{dx}{(x-1)^2}.$

Thus the left-hand side is $O(1/|s|)$ in the half-plane $\mathrm{Re}(s) > 0$.

The integral $\int_2^\infty e^{-s\Phi(x)}\, \frac{dx}{x}$ is treated analogously, once we note that
$\int_2^\infty (x - 1)^{-2}\,dx$ converges.

Since these estimates are uniform, (12) and thus (11) are proved for
$\mathrm{Re}(s) > 0$, $|s| \to \infty$.


To pass from $\mathrm{Re}(s) \geq 0$ to $\mathrm{Re}(s) \leq 0$, $s \in S_\delta$, we record the following
fact about the principal branch of $\log s$: whenever $\mathrm{Re}(s) \geq 0$, $s = \sigma + it$,
$t \neq 0$, then

$\log(-s) = \log s - i\pi$ if $t > 0$,
$\log(-s) = \log s + i\pi$ if $t < 0$.

Hence if $G(s) = e^{s\log s} e^{-s}$, $\mathrm{Re}(s) \geq 0$, $t \neq 0$, then

(13)    $G(-s)^{-1} = e^{s\log s} e^{-s} e^{-si\pi}$ if $t > 0$,
        $G(-s)^{-1} = e^{s\log s} e^{-s} e^{si\pi}$ if $t < 0$.

Next,

(14)    $\Gamma(s)\Gamma(-s) = \frac{\pi}{-s \sin \pi s},$

which follows from the fact that $\Gamma(s)\Gamma(1 - s) = \pi/\sin \pi s$, and
$\Gamma(1 - s) = -s\Gamma(-s)$ (see Theorem 1.4 and Lemma 1.2 in Chapter 6).
The combination of (13) and (14), together with the fact that for large
$s$, $(1 + O(1/|s|^{1/2}))^{-1} = 1 + O(1/|s|^{1/2})$, allows us then to extend (11)
to the whole sector $S_\delta$, thereby completing the proof of the theorem.


3 The Airy function 

The Airy function appeared first in optics, and more precisely, in the 
analysis of the intensity of light near a caustic; it was an important early 
instance in the study of asymptotics of integrals, and it continues to arise 
in a number of other problems. The Airy function Ai is defined by 

(15)    $\mathrm{Ai}(s) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i(x^3/3 + sx)}\,dx$, with $s \in \mathbb{R}$.

Let us first see that because of the rapid oscillations of the integrand as
$|x| \to \infty$, the integral converges and represents a continuous function of
$s$. In fact, note that

$\frac{1}{i(x^2 + s)} \frac{d}{dx}\left(e^{i(x^3/3 + sx)}\right) = e^{i(x^3/3 + sx)},$

so if $a \geq 2|s|^{1/2}$, we can write the integral $\int_a^R e^{i(x^3/3 + sx)}\,dx$ as

(16)    $\int_a^R \frac{1}{i(x^2 + s)} \frac{d}{dx}\left(e^{i(x^3/3 + sx)}\right) dx.$

We may now integrate by parts and let $R \to \infty$, to see that the integral
converges uniformly, and that as a result $\int_a^\infty e^{i(x^3/3 + sx)}\,dx$ is also continuous
for $|s| \leq a^2/4$. The same argument works for the integral from
$-\infty$ to $-a$ and our assertion regarding $\mathrm{Ai}(s)$ is established.

A better insight into $\mathrm{Ai}(s)$ is given by deforming the contour of integration
in (15). A choice of an optimal contour will appear below, but
for now let us notice that as soon as we replace the $x$-axis of integration
in (15) by the parallel line $L_\delta = \{x + i\delta,\ x \in \mathbb{R}\}$, $\delta > 0$, matters improve
dramatically.

In fact, we may apply the Cauchy theorem to $f(z) = e^{i(z^3/3 + sz)}$ over
the rectangle $\gamma_R$ shown in Figure 3.

One observes that $f(z) = O(e^{-\delta x^2})$ on $L_\delta$, while $f(z) = O(e^{-yR^2})$ on
the vertical sides of the rectangle. Thus since $\int_0^\delta e^{-yR^2}\,dy \to 0$ as
$R \to \infty$, we see that

$\mathrm{Ai}(s) = \frac{1}{2\pi} \int_{L_\delta} e^{i(z^3/3 + sz)}\,dz.$

Figure 3. The line $L_\delta$ and the contour $\gamma_R$

Now the majorization $f(z) = O(e^{-\delta x^2})$ continues to hold for each complex
$s$, and hence because of the (rapid) convergence of the integral, $\mathrm{Ai}(s)$
extends to an entire function of $s$.

We note next that $\mathrm{Ai}(s)$ satisfies the differential equation

(17)    $\mathrm{Ai}''(s) = s\,\mathrm{Ai}(s).$

This simple and natural equation helps to explain the ubiquity of the
Airy function. To prove (17) observe that

$\mathrm{Ai}''(s) - s\,\mathrm{Ai}(s) = \frac{1}{2\pi} \int_{L_\delta} (-z^2 - s)\, e^{i(z^3/3 + sz)}\,dz.$

But $-(z^2 + s)\, e^{i(z^3/3 + sz)} = i \frac{d}{dz}\left(e^{i(z^3/3 + sz)}\right)$, so

$\mathrm{Ai}''(s) - s\,\mathrm{Ai}(s) = \frac{i}{2\pi} \int_{L_\delta} \frac{d}{dz}(f(z))\,dz = 0,$

since $f(z) = e^{i(z^3/3 + sz)}$ vanishes as $|z| \to \infty$ along $L_\delta$.

We now turn to our main problem, the asymptotics of $\mathrm{Ai}(s)$ for large
(real) values of $s$. The differential equation (17) shows us that we may
expect different behaviors of the Airy function when $|s|$ is large, depending
on whether $s$ is positive or negative. To see this, we compare the
equation with a simple analogue

(18)    $y''(s) = A\,y(s),$







where $A$ is a large constant, with $A$ positive when considering $s$ positive
and $A$ negative in the other case. The solutions of (18) are of course $e^{\sqrt{A}s}$
and $e^{-\sqrt{A}s}$, the first growing rapidly, and the second decreasing rapidly
as $s \to \infty$, if $A > 0$. A glance at the integration by parts following (16)
shows that $\mathrm{Ai}(s)$ remains bounded when $s \to \infty$. So the comparison
with $e^{\sqrt{A}s}$ must be dismissed, and we might reasonably guess that $\mathrm{Ai}(s)$
is rapidly decreasing in this case. When $s < 0$ we take $A < 0$ in (18). The
exponentials $e^{\sqrt{A}s}$ and $e^{-\sqrt{A}s}$ are now oscillating, and we can therefore
presume that $\mathrm{Ai}(s)$ should have an oscillatory character as $s \to -\infty$.


Theorem 3.1 Suppose $u > 0$. Then as $u \to \infty$,

(i) $\mathrm{Ai}(-u) = \pi^{-1/2} u^{-1/4} \cos\left(\frac{2}{3}u^{3/2} - \frac{\pi}{4}\right)(1 + O(1/u^{3/4}))$.

(ii) $\mathrm{Ai}(u) = \frac{1}{2\pi^{1/2}}\, u^{-1/4} e^{-\frac{2}{3}u^{3/2}}(1 + O(1/u^{3/4}))$.
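The two regimes of Theorem 3.1 can be observed numerically with the following sketch. It evaluates $\mathrm{Ai}$ from its standard Maclaurin series (compare Problem 3), which is adequate only for moderate $|x|$ because of cancellation; the function names `airy_ai`, `ai_decay`, and `ai_oscillation` and the truncation at 100 terms are assumptions made for this example.

```python
import math

def airy_ai(x, terms=100):
    """Ai(x) from its Maclaurin series; suitable for moderate |x| only."""
    total = 0.0
    for n in range(terms):
        coeff = math.sin(2 * math.pi * (n + 1) / 3) * 3.0 ** (n / 3 - 2 / 3)
        total += coeff * math.gamma((n + 1) / 3) * x ** n / math.factorial(n)
    return total / math.pi

def ai_decay(u):
    """Leading term of Ai(u) in Theorem 3.1 (ii)."""
    return u ** -0.25 * math.exp(-2 * u ** 1.5 / 3) / (2 * math.sqrt(math.pi))

def ai_oscillation(u):
    """Leading term of Ai(-u) in Theorem 3.1 (i)."""
    return u ** -0.25 * math.cos(2 * u ** 1.5 / 3 - math.pi / 4) / math.sqrt(math.pi)

for u in (2.0, 4.0, 6.0):
    print(u, airy_ai(u), ai_decay(u), airy_ai(-u), ai_oscillation(u))
```

On the positive axis the quotient of the two values approaches 1; on the negative axis the comparison is better made through the difference, since the cosine has zeros.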

To consider the first case, we make the change of variables $x \mapsto u^{1/2}x$
in the defining integral with $s = -u$. This gives

$\mathrm{Ai}(-u) = u^{1/2} I_-(u^{3/2}),$

where

(19)    $I_-(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{it(x^3/3 - x)}\,dx.$

Now write

$I_-(t) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{it\Phi(x)}\,dx,$

where $\Phi(x) = \Phi_-(x) = x^3/3 - x$, and we shall apply Proposition 2.2,
which in this case, since $s$ is purely imaginary, is the method of stationary
phase. Note that $\Phi'(x) = x^2 - 1$, so there are two critical points, $x_0 =
\pm 1$; observe that $\Phi''(x) = 2x$; also $\Phi(\pm 1) = \mp 2/3$.

We break up the range of integration in (19) into two intervals $[-2, 0]$
and $[0, 2]$ each containing one critical point, and two complementary
integrals, $(-\infty, -2]$ and $[2, \infty)$.

Now we apply Proposition 2.2 to the interval $[0, 2]$ with $s = -it$, $x_0 = 1$,
$\psi = 1/2\pi$, $\Phi(1) = -2/3$, $\Phi''(1) = 2$, and get a contribution of

$\frac{e^{-\frac{2}{3}it}}{2\sqrt{\pi}} \left(\frac{1}{(-it)^{1/2}} + O\left(\frac{1}{|t|}\right)\right)$

in view of (8). Similarly the integral over $[-2, 0]$ contributes

$\frac{e^{\frac{2}{3}it}}{2\sqrt{\pi}} \left(\frac{1}{(it)^{1/2}} + O\left(\frac{1}{|t|}\right)\right).$

Finally, consider the complementary integrals. The first is

$\int_{-\infty}^{-2} e^{it\Phi(x)}\,dx = \lim_{N \to \infty} \int_{-N}^{-2} e^{it\Phi(x)}\,dx = \lim_{N \to \infty} \frac{1}{it} \int_{-N}^{-2} \frac{1}{\Phi'(x)} \frac{d}{dx}\left(e^{it\Phi(x)}\right) dx,$

where $\Phi'(x) = x^2 - 1$. So an integration by parts shows that this is
$O(1/|t|)$. The integral over $[2, \infty)$ is treated similarly. Adding these four
contributions, and inserting them in the identity $\mathrm{Ai}(-u) = u^{1/2} I_-(u^{3/2})$,
proves conclusion (i) of the theorem.$^6$

To deal with the conclusion (ii) of the theorem, we make the change
of variables $x \mapsto u^{1/2}x$ in the integral (15), with $s = u$. This gives us, for
$u > 0$,

$\mathrm{Ai}(u) = u^{1/2} I_+(u^{3/2}),$

where

(20)    $I_+(s) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-sF(x)}\,dx$

and $F(x) = -i(x^3/3 + x)$. Now when $s \to \infty$, the integrand in (20) again
oscillates rapidly, but here in distinction to the previous case, there is no
critical point on the real axis, since the derivative of $x^3/3 + x$ does not
vanish. A repeated integration by parts argument (such as we have used
before) shows that actually the integral $I_+(s)$ has fast decay as $s \to \infty$.
But what is the exact nature and order of this decrease? To answer this
question, we would have to take into account the precise cancellations
inherent in (20), and doing this by the above method does not seem
feasible.

A better way is to follow the guiding principle used in the asymptotics
of the Bessel function, and to deform the line of integration in (20) to a
contour on which the imaginary part of $F(z)$ vanishes; having done this,
one might then hope to apply Laplace's method, Proposition 2.1, to find
the true asymptotic behavior of $I_+(s)$ as $s \to \infty$.

We describe the idea in the more general situation in which we assume
only that $F(z)$ is holomorphic. To follow the approach suggested, we
seek a contour $\Gamma$ so that:


6 An alternative derivation of this conclusion can be given as a consequence of the
relation of the Airy function with the Bessel functions. See Problem 3 below.

(a) $\mathrm{Im}(F) = 0$ on $\Gamma$.

(b) $\mathrm{Re}(F)$ has a minimum on $\Gamma$ at some point $z_0$, and this function
is non-degenerate in the sense that the second derivative of $\mathrm{Re}(F)$
along $\Gamma$ is strictly positive at $z_0$.

Conditions (a) and (b) imply of course that $F'(z_0) = 0$. If as above,
$F''(z_0) \neq 0$, then there are two curves $\Gamma_1$ and $\Gamma_2$ passing through $z_0$
which are orthogonal, so that $F|_{\Gamma_i}$ is real for $i = 1, 2$, with $\mathrm{Re}(F)$
restricted to $\Gamma_1$ having a minimum at $z_0$; and $\mathrm{Re}(F)$ restricted to $\Gamma_2$
having a maximum at $z_0$ (see Exercise 2 in Chapter 8). We therefore try to
deform our original contour of integration to $\Gamma = \Gamma_1$. This approach is
usually referred to as the method of steepest descent, because at $z_0$
the function $-\mathrm{Re}(F(z))$ has a saddle point, and starting at this point and
following the path of $\Gamma_1$, one has the greatest decrease of this function.

Let us return to our special case, $F(z) = -i(z^3/3 + z)$. We note that

$\mathrm{Re}(F) = x^2 y - y^3/3 + y,$
$\mathrm{Im}(F) = -x^3/3 + xy^2 - x.$

We observe also that $F'(z) = -i(z^2 + 1)$, so we have two non-real critical
points $z_0 = \pm i$ at which $F'(z_0) = 0$. If we choose $z_0 = i$, then the two
curves passing through this point where $\mathrm{Im}(F) = 0$ are

$\Gamma_1 = \{(x, y) : y^2 = x^2/3 + 1\}$ and $\Gamma_2 = \{(x, y) : x = 0\}.$

On $\Gamma_2$, the function $\mathrm{Re}(F)$ clearly has a maximum at the point $z_0 = i$,
and so we reject this curve. We choose $\Gamma = \Gamma_1$, which is a branch of a
hyperbola, and which can be written as $y = (x^2/3 + 1)^{1/2}$; it is asymptotic
to the rays $z = re^{i\pi/6}$, and $z = re^{i5\pi/6}$ at infinity. See Figure 4.
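The defining properties of the steepest descent curve can be checked numerically with a short sketch: on the hyperbola branch $y = (x^2/3 + 1)^{1/2}$ the imaginary part of $F$ vanishes, and the real part is smallest at $z_0 = i$, where it equals $2/3$. The helper name `gamma_point` and the sample abscissas are illustrative choices.

```python
import math

def F(z):
    """F(z) = -i (z^3/3 + z), the phase function for Ai(u), u > 0."""
    return -1j * (z ** 3 / 3 + z)

def gamma_point(x):
    """Point of the hyperbola branch y = (x^2/3 + 1)^{1/2} with real part x."""
    return complex(x, math.sqrt(x * x / 3 + 1))

# Im(F) should vanish along the curve, and Re(F) should be at least 2/3,
# with equality only at x = 0, that is, at the critical point z0 = i.
for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
    value = F(gamma_point(x))
    print(x, value.real, value.imag)
```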



Figure 4. The curve of steepest descent

Next, we see that

(21)    $\frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-sF(x)}\,dx = \frac{1}{2\pi} \int_\Gamma e^{-sF(z)}\,dz.$

This identity is justified by applying the Cauchy theorem to $e^{-sF(z)}$
on the contour that consists of four arc segments: the parts of the
real axis and $\Gamma$ that lie inside the circle of radius $R$, and the two arcs
of this circle joining the axis with $\Gamma$. Since in this region $e^{-sF(z)} =
O(e^{-cyx^2})$ as $x \to \pm\infty$, the contributions of the two arcs of the circle are
$O\left(\int e^{-cR^2 \sin\theta}\,d\theta\right) = O(1/R)$, and letting $R \to \infty$ establishes (21).

We now observe that on $\Gamma$

$\Phi(x) = \mathrm{Re}(F) = y(x^2 - y^2/3 + 1) = \left(\frac{8}{9}x^2 + \frac{2}{3}\right)(x^2/3 + 1)^{1/2},$

since $y^2 = x^2/3 + 1$ there. Also, on $\Gamma$ we have that $dz = dx + i\,dy =
dx + i(x/3)(x^2/3 + 1)^{-1/2}\,dx$. Thus,

(22)    $\frac{1}{2\pi} \int_\Gamma e^{-sF(z)}\,dz = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-s\Phi(x)}\,dx,$

in view of the fact that $\Phi(x)$ is even, while $x(x^2/3 + 1)^{-1/2}$ is odd.

We note next that since $(1 + u)^{1/2} = 1 + u/2 + O(u^2)$ as $u \to 0$,

$\Phi(x) = \left(\frac{8}{9}x^2 + \frac{2}{3}\right)\left(1 + \frac{x^2}{6}\right) + O(x^4) = x^2 + \frac{2}{3} + O(x^4),$

and so $\Phi''(0) = 2$. We now apply Proposition 2.1 to estimate the main
part of the right-hand side of (22), by

$\frac{1}{2\pi} \int_{-c}^{c} e^{-s\Phi(x)}\,dx,$

where $c$ is a small positive constant. Since $\Phi(0) = 2/3$, $\Phi''(0) = 2$, and
$\psi(0) = 1/2\pi$, we obtain that this term contributes

$e^{-\frac{2}{3}s} \left[\frac{1}{2\pi^{1/2} s^{1/2}} + O\left(\frac{1}{s}\right)\right].$

The term $\int_c^\infty e^{-s\Phi(x)}\,dx$ is dominated by $e^{-2s/3} \int_c^\infty e^{-c_1 s x^2}\,dx$, which is
$O(e^{-2s/3} e^{-\delta s})$ for some $\delta > 0$, as soon as $c > 0$. A similar estimate holds
for $\int_{-\infty}^{-c} e^{-s\Phi(x)}\,dx$. Altogether, then,

$I_+(s) = e^{-\frac{2}{3}s} \left[\frac{1}{2\pi^{1/2} s^{1/2}} + O\left(\frac{1}{s}\right)\right]$ as $s \to \infty$,

and this gives the desired asymptotic (ii) for the Airy function.


4 The partition function


Our last illustration of the techniques developed in this appendix is in 
their application to the partition function p(n), which was discussed in 
Chapter 10. We derive for it the main term of the remarkable asymptotic 
formula of Hardy-Ramanujan. 


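Theorem 4.1 If $p$ denotes the partition function, then

(i) $p(n) \sim \frac{1}{4\sqrt{3}\,n}\, e^{K n^{1/2}}$ as $n \to \infty$, where $K = \pi\sqrt{2/3}$.

(ii) A much more precise assertion is that

$p(n) = \frac{1}{2\pi\sqrt{2}} \frac{d}{dn}\left(\frac{e^{K(n - \frac{1}{24})^{1/2}}}{(n - \frac{1}{24})^{1/2}}\right) + O\left(e^{\frac{K}{2} n^{1/2}}\right).$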


Note. Observe that $(n - \frac{1}{24})^{1/2} - n^{1/2} = O(n^{-1/2})$, by the mean-value
theorem; hence $e^{K(n - \frac{1}{24})^{1/2}} = e^{Kn^{1/2}}(1 + O(n^{-1/2}))$, thus
$e^{K(n - \frac{1}{24})^{1/2}} \sim e^{Kn^{1/2}}$, as $n \to \infty$. Of course, clearly $(n - \frac{1}{24})^{1/2} \sim n^{1/2}$,
and in particular (ii) implies (i).
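To see how much sharper (ii) is than (i), one can compare both main terms with exact values of $p(n)$. The sketch below computes $p(n)$ by a standard dynamic program (not the method of this appendix) and carries out the derivative in (ii) by hand, using $\frac{d}{dn}\left(e^{K\lambda}/\lambda\right) = e^{K\lambda}(K\lambda - 1)/(2\lambda^3)$ with $\lambda = (n - \frac{1}{24})^{1/2}$. The function names are illustrative.

```python
import math

def partition_numbers(limit):
    """Exact values p(0), ..., p(limit), adding one part size at a time."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p

K = math.pi * math.sqrt(2 / 3)

def hr_first(n):
    """Main term of Theorem 4.1 (i)."""
    return math.exp(K * math.sqrt(n)) / (4 * math.sqrt(3) * n)

def hr_refined(n):
    """Main term of Theorem 4.1 (ii), with the derivative in n evaluated by hand."""
    lam = math.sqrt(n - 1 / 24)
    return math.exp(K * lam) * (K * lam - 1) / (4 * math.pi * math.sqrt(2) * lam ** 3)

p = partition_numbers(200)
for n in (10, 100, 200):
    print(n, p[n], hr_first(n) / p[n], hr_refined(n) / p[n])
```

The first quotient approaches 1 slowly, while the second is already extremely close to 1 for $n = 100$, as the error term $O(e^{\frac{K}{2} n^{1/2}})$ suggests.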


We shall discuss first, in a more general setting, how we might derive
the asymptotic behavior of a sequence $\{F_n\}$ from the analytic properties
of its generating function $F(w) = \sum_{n=0}^\infty F_n w^n$. Assuming for the sake of
simplicity that $\sum F_n w^n$ has the unit disc as its disc of convergence, we
can set forth the following heuristic principle: the asymptotic behavior
of $F_n$ is determined by the location and nature of the "singularities" of
$F$ on the unit circle, and the contribution to the asymptotic formula
due to each singularity corresponds in magnitude to the "order" of that
singularity.

A very simple example in which this principle is unambiguous and can
be verified occurs when $F$ is meromorphic in a larger disc, but has only
one singularity on the circle, a pole of order $r$ at the point $w = 1$. Then
there is a polynomial $P$ of degree $r - 1$ so that $F_n = P(n) + O(e^{-\epsilon n})$ as
$n \to \infty$, for some $\epsilon > 0$. In fact, $\sum_{n=0}^\infty P(n)w^n$ is a good approximation
to $F(w)$ near $w = 1$; it is the principal part of the pole of $F$. (See also
Problem 4.)
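A concrete instance of this simple example, chosen here purely for illustration, is $F(w) = 1/((1 - w)^2(1 - w/2))$: it has a pole of order $r = 2$ at $w = 1$ and its only other pole is at $w = 2$. Multiplying the two geometric-type series gives $F_n = 2n + 2^{-n}$, so $P(n) = 2n$ has degree $r - 1$ and the error is $e^{-n \log 2}$. The sketch below verifies this with exact rational arithmetic.

```python
from fractions import Fraction

def coefficients(limit):
    """Taylor coefficients F_0, ..., F_limit of 1 / ((1 - w)^2 (1 - w/2)).

    1/(1-w)^2 has coefficients k + 1 and 1/(1 - w/2) has coefficients 2^{-k};
    the coefficients of the product are their convolution.
    """
    return [
        sum(Fraction(k + 1) * Fraction(1, 2 ** (n - k)) for k in range(n + 1))
        for n in range(limit + 1)
    ]

# Columns: n, F_n, and the exponentially small remainder F_n - P(n), P(n) = 2n.
for n, value in enumerate(coefficients(12)):
    print(n, float(value), float(value - 2 * n))
```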

For the partition function the analysis is not as simple as this example,
but the principle stated above is still applicable when properly
interpreted. To this task we now turn.


We recall the formula

$\sum_{n=0}^\infty p(n)w^n = \prod_{n=1}^\infty \frac{1}{1 - w^n},$

established in Theorem 2.1, Chapter 10. This identity implies that the
generating function is holomorphic in the unit disc. In what follows, it
will be convenient to pass from the unit disc to the upper half-plane by
writing $w = e^{2\pi i z}$, $z = x + iy$, and taking $y > 0$. We therefore have

$\sum_{n=0}^\infty p(n)e^{2\pi i n z} = f(z),$

with

$f(z) = \prod_{n=1}^\infty \frac{1}{1 - e^{2\pi i n z}},$

and

(23)    $p(n) = \int_\gamma f(z)\, e^{-2\pi i n z}\,dz.$

Here $\gamma$ is the segment in the upper half-plane joining $-1/2 + i\delta$ to
$1/2 + i\delta$, with $\delta > 0$; the height $\delta$ will be fixed later in terms of $n$.

To proceed further, we look first at where the main contribution to
the integral (23) might be, in terms of the relative size of $f(x + iy)$, as
$y \to 0$. Notice that $f$ is largest near $z = 0$. This is because
$|f(x + iy)| \leq f(iy)$, and moreover $f(iy)$ increases as $y$ decreases, in view
of the fact that the coefficients $p(n)$ are positive. Alternatively, we observe
that each factor $1 - e^{2\pi i n z}$, appearing in the product for $f$, vanishes
as $z \to 0$, but the same is true for any other point (mod 1) on the real
axis. Thus in analogy with the simple example considered above, we seek
an elementary function $f_1$, which has much the same behavior as $f$ at
$z = 0$, and try to replace $f$ by $f_1$ in (23).

It is here that we are very fortunate, because the generating function
is just a variant of the Dedekind eta function,

$\eta(z) = e^{i\pi z/12} \prod_{n=1}^\infty (1 - e^{2\pi i n z}).$

From this, it is obvious that

$f(z) = e^{i\pi z/12}\, (\eta(z))^{-1}.$

(Incidentally, the fraction 1/12 arising above will explain the occurrence
of the fraction 1/24 in the asymptotic formula for $p(n)$.)

Since $\eta$ satisfies the functional equation $\eta(-1/z) = \sqrt{z/i}\, \eta(z)$ (see
Proposition 1.9 in Chapter 10), it follows that

(24)    $f(z) = \sqrt{z/i}\; e^{\frac{i\pi}{12z}} e^{\frac{i\pi z}{12}} f(-1/z).$

Notice also that if $z$ is appropriately restricted and $z \to 0$, then
$\mathrm{Im}(-1/z) \to \infty$, from which it follows that $f(-1/z) \to 1$ rapidly, because

(25)    $f(z) = 1 + O(e^{-2\pi y})$, $z = x + iy$, $y \geq 1$.
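The transformation law (24) lends itself to a direct numerical check. In the sketch below the infinite product for $f$ is truncated after 200 factors, and the square root $\sqrt{z/i}$ is taken as the principal branch, which is the natural one because $z/i$ has positive real part when $\mathrm{Im}(z) > 0$; the truncation length, the test points, and the helper name `rhs_24` are assumptions of this example.

```python
import cmath

def f(z, terms=200):
    """Truncated product for f(z) = prod_{n>=1} 1 / (1 - e^{2 pi i n z}), Im(z) > 0."""
    value = 1 + 0j
    for n in range(1, terms + 1):
        value /= 1 - cmath.exp(2j * cmath.pi * n * z)
    return value

def rhs_24(z):
    """Right-hand side of (24), with the principal branch of the square root."""
    return (cmath.sqrt(z / 1j)
            * cmath.exp(1j * cmath.pi / (12 * z))
            * cmath.exp(1j * cmath.pi * z / 12)
            * f(-1 / z))

# Both sides of (24) should agree up to rounding; the last column shows how
# close f(-1/z) already is to 1 at these points, in the spirit of (25).
for z in (0.5j, 0.3 + 0.4j):
    print(z, f(z), rhs_24(z), abs(f(-1 / z) - 1))
```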

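Thus it is natural to choose $f_1(z) = \sqrt{z/i}\; e^{\frac{i\pi}{12z}} e^{\frac{i\pi z}{12}}$ as the function which
approximates well the generating function $f(z)$ (at $z = 0$), and write
(because of (24))

$p(n) = p_1(n) + E(n),$

with

$p_1(n) = \int_\gamma \sqrt{z/i}\; e^{\frac{i\pi}{12z}} e^{\frac{i\pi z}{12}} e^{-2\pi i n z}\,dz,$

$E(n) = \int_\gamma \sqrt{z/i}\; e^{\frac{i\pi}{12z}} e^{\frac{i\pi z}{12}} e^{-2\pi i n z} \left(f(-1/z) - 1\right) dz.$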


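We first take care of the error term $E(n)$, and in doing so we specify
$\gamma$ by choosing its height in terms of $n$. In estimating $E(n)$ we replace its
integrand by its absolute value and note that if $z \in \gamma$, then

(26)    $\left|\sqrt{z/i}\; e^{\frac{i\pi}{12z}} e^{\frac{i\pi z}{12}} e^{-2\pi i n z}\right| \leq c\, e^{2\pi n\delta}\, e^{\frac{\pi}{12} \frac{\delta}{\delta^2 + x^2}},$

since $z = x + i\delta$, and $\mathrm{Re}(i/z) = \delta/(\delta^2 + x^2)$.

On the other hand, we can make two estimates for $f(-1/z) - 1$. The
first arises from (25) by replacing $z$ by $-1/z$, and gives

(27)    $|f(-1/z) - 1| \leq c\, e^{-2\pi \frac{\delta}{\delta^2 + x^2}}$ if $\frac{\delta}{\delta^2 + x^2} \geq 1$.

For the second, we observe that $|f(z)| \leq f(iy) \leq C e^{\frac{\pi}{12y}}$, when $y \leq 1$,
because of the functional equation (24), and hence

(28)    $|f(-1/z) - 1| \leq O\left(e^{\frac{\pi}{12} \frac{\delta^2 + x^2}{\delta}}\right) = O\left(e^{\frac{\pi}{48\delta}}\right)$

if $\frac{\delta}{\delta^2 + x^2} \leq 1$, since $|x| \leq 1/2$.

Therefore in the integral defining $E(n)$ we use (26) and (27) when
$\frac{\delta}{\delta^2 + x^2} \geq 1$, and (26) and (28) when $\frac{\delta}{\delta^2 + x^2} \leq 1$. The first leads to a
contribution of $O(e^{2\pi n\delta})$, since $2\pi \geq \pi/12$. The second gives a contribution
of $O(e^{2\pi n\delta} e^{\frac{\pi}{48\delta}})$. Hence $E(n) = O(e^{2\pi n\delta} e^{\frac{\pi}{48\delta}})$, and we choose $\delta$ so as to
minimize the right-hand side, that is, $2\pi n\delta = \frac{\pi}{48\delta}$; this means we take
$\delta = \frac{1}{4\sqrt{6}\, n^{1/2}}$, and we get

$E(n) = O\left(e^{\frac{4\pi}{4\sqrt{6}} n^{1/2}}\right) = O\left(e^{\frac{K}{2} n^{1/2}}\right),$

which is the desired size of the error term.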


We turn to the main term $p_1(n)$. To simplify later calculations we
"improve" the contour $\gamma$ by adding to it two small end-segments; these
are the segment joining $-1/2$ to $-1/2 + i\delta$ and that joining $1/2 + i\delta$ to
$1/2$. We call this new contour $\gamma'$ (see Figure 5).

Figure 5. $\gamma$ and the improved contour $\gamma'$

Notice that since $\sqrt{z/i}\; e^{\frac{i\pi}{12z}}$ is $O(1)$ on the two added segments (for
the integral defining $p_1(n)$), the modification contributes $O(e^{2\pi n\delta}) =
O\left(e^{\frac{2\pi}{4\sqrt{6}} n^{1/2}}\right) = O\left(e^{\frac{K}{4} n^{1/2}}\right)$, which is even smaller than the allowed error,
and therefore can be incorporated in $E(n)$. So without introducing further
notation we will rewrite $p_1(n)$ replacing the contour $\gamma$ by $\gamma'$ in the
integration defining $p_1$, namely

(29)    $p_1(n) = \int_{\gamma'} \sqrt{z/i}\; e^{\frac{i\pi}{12z}} e^{\frac{i\pi z}{12}} e^{-2\pi i n z}\,dz.$

Next we simplify the triad of exponentials appearing in (29) by making
a change of variables $z \mapsto \mu z$ so that their combination takes the form

$e^{Ai\left(\frac{1}{z} - z\right)}.$

This can be achieved under the two conditions $A = 2\pi\mu(n - \frac{1}{24})$ and
$A = \frac{\pi}{12\mu}$, which means that

$A = \frac{\pi}{\sqrt{6}} \left(n - \frac{1}{24}\right)^{1/2}$ and $\mu = \frac{1}{2\sqrt{6}} \left(n - \frac{1}{24}\right)^{-1/2}.$
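The algebra behind this change of variables is easy to confirm numerically. The sketch below checks that the exponent $\frac{i\pi}{12w} + \frac{i\pi w}{12} - 2\pi i n w$ at $w = \mu z$ equals $Ai(1/z - z)$ for the stated $A$ and $\mu$; it also evaluates the rescaled height $\delta\mu^{-1}$ for $\delta = 1/(4\sqrt{6}\,n^{1/2})$, the value of $\delta$ that balances $2\pi n\delta$ against $\pi/(48\delta)$. The function names and sample points are illustrative.

```python
import math

def parameters(n):
    """The constants A and mu determined by A = 2 pi mu (n - 1/24) = pi / (12 mu)."""
    shifted = n - 1 / 24
    mu = 1 / (2 * math.sqrt(6) * math.sqrt(shifted))
    A = math.pi * math.sqrt(shifted) / math.sqrt(6)
    return A, mu

def exponent_before(n, w):
    """Sum of the exponents of the three exponentials in (29) at the point w."""
    return 1j * math.pi / (12 * w) + 1j * math.pi * w / 12 - 2j * math.pi * n * w

def exponent_after(n, z):
    """The combined exponent A i (1/z - z) after the substitution w = mu z."""
    A, _ = parameters(n)
    return 1j * A * (1 / z - z)

def rescaled_height(n):
    """delta / mu for delta = 1 / (4 sqrt(6) sqrt(n)); tends to 1/2 as n grows."""
    _, mu = parameters(n)
    return 1 / (4 * math.sqrt(6) * math.sqrt(n)) / mu

for n in (10, 100, 1000):
    A, mu = parameters(n)
    z = 0.3 + 0.5j
    print(n, A, mu, abs(exponent_before(n, mu * z) - exponent_after(n, z)), rescaled_height(n))
```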


Making the indicated change of variables we now have

(30)    $p_1(n) = \mu^{3/2} \int_\Gamma e^{-sF(z)} \sqrt{z/i}\,dz,$

with $F(z) = i(z - 1/z)$, $s = \frac{\pi}{\sqrt{6}} (n - \frac{1}{24})^{1/2}$. The curve $\Gamma$ (see Figure 6)
is now the union of three segments $[-a_n, -a_n + i\delta']$, $[-a_n + i\delta', a_n + i\delta']$,
and $[a_n + i\delta', a_n]$; we can write $\Gamma = \mu^{-1}\gamma'$.

Figure 6. The curve $\Gamma$

Here $a_n = \frac{1}{2}\mu^{-1} = \sqrt{6}\,(n - \frac{1}{24})^{1/2} \approx n^{1/2}$, while

$\delta' = \delta\mu^{-1} = \frac{2\sqrt{6}\,(n - \frac{1}{24})^{1/2}}{4\sqrt{6}\, n^{1/2}} \to \frac{1}{2}$, as $n \to \infty$.


We apply the method of steepest descent to the integral (30). In doing
this, we note that $F(z) = i(z - 1/z)$ has one (complex) critical point
$z = i$ in the upper half-plane. Moreover, the two curves passing through
$i$ on which $F$ is real are: the imaginary axis, on which $F$ has a maximum
at $z = i$, which we reject, and the unit circle, on which $F$ has a minimum
at $z = i$. Thus using Cauchy's theorem we replace the integration on $\Gamma$
by the integration over our final curve $\Gamma^*$, which consists of the segments
$[-a_n, -1]$, $[1, a_n]$, together with the upper semicircle joining $-1$ to $1$.

Figure 7. The final curve $\Gamma^*$

We therefore have

$p_1(n) = \mu^{3/2} \int_{\Gamma^*} e^{-sF(z)} \sqrt{z/i}\,dz.$

The contributions on the segments $[-a_n, -1]$ and $[1, a_n]$ are relatively
very small, because on the real axis the exponential has absolute value
1, and hence the integrand is bounded by $\sup_{|z| \leq a_n} |z/i|^{1/2}$, and this leads
to two terms which are $O(a_n^{3/2} \mu^{3/2}) = O(1)$.

Finally, we come to the principal part, which is the integration over the
semicircle, taken with the orientation on the figure. Here we write $z =
e^{i\theta}$, $dz = ie^{i\theta}\,d\theta$. Since $i(z - 1/z) = -2\sin\theta$, this gives a contribution

$-\mu^{3/2} \int_0^\pi e^{2s\sin\theta} e^{i3\theta/2} \sqrt{i}\,d\theta = \mu^{3/2} \int_{-\pi/2}^{\pi/2} e^{2s\cos\theta} (\cos(3\theta/2) + i\sin(3\theta/2))\,d\theta.$

In applying Proposition 2.1, Laplace's method, we take $\Phi(\theta) = -\cos\theta$,
$\theta_0 = 0$, so $\Phi(\theta_0) = -1$, $\Phi''(\theta_0) = 1$ and we choose $\psi(\theta) = \cos(3\theta/2) +
i\sin(3\theta/2)$, so that $\psi(\theta_0) = 1$. Therefore, the above contributes

$\mu^{3/2} e^{2s} \frac{\sqrt{2\pi}}{(2s)^{1/2}} \left(1 + O(s^{-1/2})\right).$

Now since $s = \frac{\pi}{\sqrt{6}} (n - \frac{1}{24})^{1/2}$, $K = \pi\sqrt{2/3}$, and $\mu = \frac{\sqrt{6}}{12} (n - \frac{1}{24})^{-1/2}$,
we obtain

$p(n) = \frac{1}{4\sqrt{3}\,n}\, e^{K n^{1/2}} \left(1 + O(n^{-1/4})\right),$

and the first conclusion of the theorem is established.

To obtain the more exact conclusion (ii), we retrace our steps and use
an additional device, which allows us to evaluate rather precisely the key
integral. With $p_1(n)$ defined by (29), which is an integral taken over
$\gamma' = \gamma'_n$, we write

$p_1(n) = \frac{d}{dn}\, q(n) + e(n),$

where

$q(n) = \frac{1}{2\pi} \int_{\gamma'} (z/i)^{-1/2}\, e^{\frac{i\pi}{12z}} e^{\frac{i\pi z}{12}} e^{-2\pi i n z}\,dz,$

and $e(n)$ is the term due to the variation of the contour $\gamma' = \gamma'_n$, when
forming the derivative in $n$. By Cauchy's theorem this is easily seen to
be dominated by $O(e^{2\pi n\delta})$, which we have seen is $O(e^{\frac{K}{4} n^{1/2}})$, and can be
subsumed in the error term. To analyze $q(n)$, we proceed as before, first
making the change of variables $z \mapsto \mu z$, and then replacing the resulting
contour $\Gamma$ by $\Gamma^*$. As a consequence, we have

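(31)    $q(n) = \frac{\mu^{1/2}}{2\pi} \int_{\Gamma^*} e^{-sF(z)} (z/i)^{-1/2}\,dz,$

with $F(z) = i(z - 1/z)$, $s = \frac{\pi}{\sqrt{6}} (n - \frac{1}{24})^{1/2}$, and $\mu = \frac{\sqrt{6}}{12} (n - \frac{1}{24})^{-1/2}$.

Now the two segments $[-a_n, -1]$ and $[1, a_n]$ of the contour $\Gamma^*$ make
harmless contributions, since $F$ is purely imaginary on the real
axis. Indeed, they yield terms which are $O(a_n^{1/2} \mu^{1/2}) = O(1)$.

The main part of (31) is the term arising from the integration on the
semicircle. Thus setting $z = e^{i\theta}$, $dz = ie^{i\theta}\,d\theta$, and $i(z - 1/z) = -2\sin\theta$,
it equals

$-\frac{\mu^{1/2}}{2\pi} \int_0^\pi e^{2s\sin\theta} e^{i\theta/2}\, i^{3/2}\,d\theta = \frac{\mu^{1/2}}{2\pi} \int_{-\pi/2}^{\pi/2} e^{2s\cos\theta} (\cos(\theta/2) + i\sin(\theta/2))\,d\theta$

$= \frac{\mu^{1/2}}{2\pi} \int_{-\pi/2}^{\pi/2} e^{2s\cos\theta} \cos(\theta/2)\,d\theta,$

where we have used the fact that the integral $\int_{-\pi/2}^{\pi/2} e^{2s\cos\theta} \sin(\theta/2)\,d\theta$
vanishes since the integrand is odd.

Now $\cos\theta = 1 - 2(\sin\theta/2)^2$, so setting $x = \sin(\theta/2)$ we see that the
above integral becomes

$\frac{\mu^{1/2}}{\pi}\, e^{2s} \int_{-\sqrt{2}/2}^{\sqrt{2}/2} e^{-4sx^2}\,dx.$

However

$\int_{-\sqrt{2}/2}^{\sqrt{2}/2} e^{-4sx^2}\,dx = \int_{-\infty}^{\infty} e^{-4sx^2}\,dx + O\left(\int_{\sqrt{2}/2}^{\infty} e^{-4sx^2}\,dx\right) = \frac{\sqrt{\pi}}{2s^{1/2}} + O(e^{-2s}),$

and also

$\frac{d}{ds} \int_{-\sqrt{2}/2}^{\sqrt{2}/2} e^{-4sx^2}\,dx = \frac{d}{ds}\left(\frac{\sqrt{\pi}}{2s^{1/2}}\right) + O(e^{-2s}).$

Gathering all the error terms together, we find

$p(n) = \frac{d}{dn}\left(\mu^{1/2}\, \frac{e^{2s}}{2\sqrt{\pi}\, s^{1/2}}\right) + O\left(e^{\frac{K}{2} n^{1/2}}\right).$

Since $s = \frac{\pi}{\sqrt{6}} (n - \frac{1}{24})^{1/2}$, $\mu = \frac{\sqrt{6}}{12} (n - \frac{1}{24})^{-1/2}$, and $K = \pi\sqrt{2/3}$, this is

$p(n) = \frac{1}{2\pi\sqrt{2}} \frac{d}{dn}\left(\frac{e^{K(n - \frac{1}{24})^{1/2}}}{(n - \frac{1}{24})^{1/2}}\right) + O\left(e^{\frac{K}{2} n^{1/2}}\right),$

and the theorem is proved.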


5 Problems 

1. Let $\eta$ be an indefinitely differentiable function supported in a finite interval, so
that $\eta(x) = 1$ for $x$ near 0. Then, if $m > -1$ and $N > 0$,

$\int_0^\infty e^{-sx} x^m \eta(x)\,dx = s^{-m-1}\Gamma(m + 1) + O(s^{-N})$

for $\mathrm{Re}(s) \geq 0$, $|s| \to \infty$.

(a) Consider first the case $-1 < m \leq 0$. It suffices to see that

$\int_0^\infty e^{-sx} x^m (1 - \eta(x))\,dx = O(s^{-N}),$

and this can be done by repeated integration by parts since

$e^{-sx} = (-1)^N s^{-N} \left(\frac{d}{dx}\right)^N (e^{-sx}).$

(b) To extend this to all $m$, find an integer $k$ so that $k - 1 < m \leq k$, write

$\int_0^\infty e^{-sx} \left[\left(\frac{d}{dx}\right)^k (x^m)\right] \eta(x)\,dx = c_{k,m}\, s^{-m+k-1} + O(s^{-N}),$

and integrate by parts $k$ times.



2. The following is a more precise version of Stirling's formula. There are real
constants $a_1 = 1/12$, $a_2$, ..., $a_n$, ..., so that for every $N > 0$

$\Gamma(s) = e^{s\log s} e^{-s} \frac{\sqrt{2\pi}}{s^{1/2}} \left(1 + \sum_{j=1}^N a_j s^{-j} + O(s^{-N})\right)$

when $s \in S_\delta$.

This can be proved by using the results of Problem 1 in place of Proposition 1.3.


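3. The Bessel functions and Airy function have the following power series expansions:

$J_\nu(x) = \left(\frac{x}{2}\right)^\nu \sum_{m=0}^\infty \frac{(-1)^m \left(\frac{x^2}{4}\right)^m}{m!\, \Gamma(\nu + m + 1)},$

$\mathrm{Ai}(-x) = \frac{1}{\pi} \sum_{n=0}^\infty \frac{(-x)^n}{n!} \sin(2\pi(n + 1)/3)\, 3^{n/3 - 2/3}\, \Gamma(n/3 + 1/3).$

(a) From this, verify that when $x > 0$,

$\mathrm{Ai}(-x) = \frac{x^{1/2}}{3} \left(J_{1/3}\left(\frac{2}{3}x^{3/2}\right) + J_{-1/3}\left(\frac{2}{3}x^{3/2}\right)\right).$

(b) The function $\mathrm{Ai}(x)$ extends to an entire function of order 3/2.

[Hint: For (b), use (a), or alternatively, apply Problem 4 in Chapter 5 to the power
series for Ai. Compare also with Problem 1, Chapter 4.]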


4. Suppose $F(w) = \sum_{n=0}^\infty F_n w^n$ is meromorphic in a region containing the closed
unit disc, and the only poles of $F$ are on the unit circle at the points $\alpha_1, \ldots, \alpha_k$,
and their orders are $r_1, \ldots, r_k$ respectively. Then for some $\epsilon > 0$

$F_n = \sum_{j=1}^k P_j(n) + O(e^{-\epsilon n})$ as $n \to \infty$.

Here

$P_j(n) = -\frac{1}{(r_j - 1)!} \lim_{w \to \alpha_j} \left(\frac{d}{dw}\right)^{r_j - 1} \left[(w - \alpha_j)^{r_j} w^{-n-1} F(w)\right].$

Note that each $P_j$ is of the form $P_j(n) = A_j\, \alpha_j^{-n}\, n^{r_j - 1} + O(n^{r_j - 2})$.

To prove this, use the residue formula (Theorem 1.4, Chapter 3).

5.* The one shortcoming in our derivation of the asymptotic formula for $p(n)$ arose
from the fact that while $f_1(z) = \sqrt{z/i}\; e^{\frac{i\pi}{12z}} e^{\frac{i\pi z}{12}}$ is a good approximation to the
generating function $f(z)$ near $z = 0$, this fails near other points on the real axis,
since $f_1$ is regular there, but $f$ is not.

However, using the transformation law (24) and the identity f(z + 1) = f(z), 
one can derive the following generalization of (24): whenever p/q is a rational 
number in lowest form (so p and q are relatively prime) then 



where $pp' = 1 \bmod q$. Here $\omega_{p/q}$ is an appropriate 24th root of unity. This formula
leads to an analogous $f_{p/q}$, approximating $f$ at $z = p/q$.

From this one can obtain for each $p/q$ a contribution of the form

$c_{p/q}\, \frac{1}{2\pi\sqrt{2}} \frac{d}{dn}\left(\frac{e^{\frac{K}{q}(n - \frac{1}{24})^{1/2}}}{(n - \frac{1}{24})^{1/2}}\right)$

to the asymptotic formula for $p(n)$. When suitably modified, the resulting series,
summed over all proper fractions $p/q$ in $[0, 1)$, actually converges and gives an
exact formula for $p(n)$.





Appendix B: Simple Connectivity 
and Jordan Curve Theorem 


Jordan was one of the precursors of the theory of functions
of a real variable. He introduced in this part of
analysis the capital notion of functions of bounded
variation. Not less celebrated is his study of curves,
universally called Jordan curves, which curves separate
the plane in two distinct regions. We also owe
him important propositions regarding the measure of
sets that have led the way to numerous modern
researches.

E. Picard, 1922 


The notion of simple connectivity is at the source of many basic and
fundamental results in complex analysis. To clarify the meaning of this
important concept, we have gathered in this appendix some further insights
into the properties of simply connected sets. Closely tied to the
idea of simple connectivity is the notion of the "interior" of a simple
closed curve. The theorem of Jordan states that this interior is well-defined
and is simply connected. We prove here the special case of this
theorem for curves which are piecewise-smooth.

Recall the definition in Chapter 3, according to which a region $\Omega$ is
simply connected if any two curves in $\Omega$ with the same end-points are
homotopic. From this definition we deduced an important version of
Cauchy's theorem which states that if $\Omega$ is simply connected and $\gamma \subset \Omega$
is any closed curve, then

(1)    $\int_\gamma f(\zeta)\,d\zeta = 0$

whenever $f$ is holomorphic in $\Omega$. Here, we shall prove that a converse
also holds, therefore:

(I) A region $\Omega$ is simply connected if and only if it is holomorphically
simply connected; that is, whenever $\gamma \subset \Omega$ is closed and $f$
holomorphic in $\Omega$ then (1) holds.


Besides this fundamental equivalence, which is analytic in nature, there
are also topological conditions that can be used to describe simple connectivity.
In fact, the definition in terms of homotopies suggests that a
simply connected set has no "holes." In other words, one cannot find a
closed curve in $\Omega$ that loops around points that do not belong to $\Omega$. In the
first part of this appendix we shall also turn these intuitive statements
into tangible theorems:

(II) We show that a bounded region $\Omega$ is simply connected if and only
if its complement is connected.

(III) We define the winding number of a curve around a point, and prove
that $\Omega$ is simply connected if and only if no curve in $\Omega$ winds around
points in the complement of $\Omega$.

In the second part of this appendix we return to the problem of curves
and their interior. The main question is the following: given a closed
curve $\Gamma$ that does not intersect itself (it is simple), can we make sense
of the "region enclosed by $\Gamma$"? In other words, what is the "interior" of
$\Gamma$? Naturally, we may expect the interior to be open, bounded, simply
connected, and have $\Gamma$ as its boundary. To solve this problem, at least
when the curve is piecewise-smooth, we prove a theorem that guarantees
the existence of a unique set which satisfies all the desired properties.
This is a special case of the Jordan curve theorem, which is valid in the
general case when the simple curve is assumed to be merely continuous.
In particular, our result leads to a generalization of Cauchy's theorem in
Chapter 2 which we formulated for toy contours.

We continue to follow the convention set in Chapter 1 by using the term
"curve" synonymously with "piecewise-smooth curve," unless stated otherwise.

1 Equivalent formulations of simple connectivity 

We first dispose of (I). 

Theorem 1.1 A region f] is holomorphically simply connected if and 
only if f] is simply connected. 

Proof. One direction is simply the version of Cauchy’s theorem in 
Corollary 5.3, Chapter 3. Conversely, suppose that is holomorphically 
simply connected. If f] = C, then it is clearly simply connected. If 
is not all of C, recall that the Riemann mapping theorem still applies 
(see the remark following its proof in Chapter 8), hence is conformally 
equivalent to the unit disc. Since the unit disc is simply connected, the 
same must be true of 
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Next, we turn to (II) and (III), which, as we mentioned, are both 
precise formulations of the fact that a simply connected region cannot 
have “holes.” 

Theorem 1.2 If $\Omega$ is a bounded region in $\mathbb{C}$, then $\Omega$ is simply connected
if and only if the complement of $\Omega$ is connected.

Note that we assume that $\Omega$ is bounded. If this is not the case, then the
theorem as stated does not hold; for example, an infinite strip is simply
connected yet its complement consists of two components. However, if
the complement is taken with respect to the extended complex plane,
that is, the Riemann sphere, then the conclusion of the theorem holds
regardless of whether $\Omega$ is bounded or not.

Proof. We begin with the proof that if $\Omega^c$ is connected, then $\Omega$ is
simply connected. This will be achieved by showing that $\Omega$ is
holomorphically simply connected. Therefore, let $\gamma$ be a closed curve in $\Omega$
and $f$ a holomorphic function on $\Omega$. Since $\Omega$ is bounded, the set$^1$

$$K = \{z \in \Omega : d(z, \Omega^c) \geq \epsilon\}$$

is compact, and for sufficiently small $\epsilon$, the set $K$ contains $\gamma$. In an
attempt to apply Runge’s theorem (Theorem 5.7 in Chapter 2), we must
first show that the complement $K^c$ of $K$ is connected.

If this is not the case, then $K^c$ is the disjoint union of two non-empty
open sets, say $K^c = \mathcal{O}_1 \cup \mathcal{O}_2$. Let

$$F_1 = \mathcal{O}_1 \cap \Omega^c \quad \text{and} \quad F_2 = \mathcal{O}_2 \cap \Omega^c.$$

Clearly, $\Omega^c = F_1 \cup F_2$, so if we can show that $F_1$ and $F_2$ are disjoint,
closed, and non-empty, then we will conclude that $\Omega^c$ is not connected,
thus contradicting the hypothesis in the theorem. Since $\mathcal{O}_1$ and $\mathcal{O}_2$ are
disjoint, so are $F_1$ and $F_2$. To see why $F_1$ is closed, suppose $\{z_n\}$ is a
sequence of points in $F_1$ that converges to $z$. Since $\Omega^c$ is closed we must
have $z \in \Omega^c$, and since $\Omega^c$ is at a finite distance from $K$, we deduce that
$z \in \mathcal{O}_1 \cup \mathcal{O}_2$. Now we observe that we cannot have $z \in \mathcal{O}_2$, for otherwise
we would have $z_n \in \mathcal{O}_2$ for sufficiently large $n$ because $\mathcal{O}_2$ is open, and
this contradicts the fact that $z_n \in F_1$ and $\mathcal{O}_1 \cap \mathcal{O}_2 = \emptyset$. Hence $z \in \mathcal{O}_1$
and $F_1$ is closed, as desired. Finally, we claim that $F_1$ is non-empty.
If otherwise, $\mathcal{O}_1$ is contained in $\Omega$. Select any point $w \in \mathcal{O}_1$, and since
$w \notin K$, there exists $z \in \Omega^c$ with $|w - z| < \epsilon$, and the entire line segment
from $w$ to $z$ belongs to $K^c$. Since $z \in \mathcal{O}_2$ (because $\mathcal{O}_1 \subset \Omega$), some point
on the line segment $[z, w]$ must belong to neither $\mathcal{O}_1$ nor $\mathcal{O}_2$, and this is
a contradiction. More precisely, if we set

$$t^* = \sup\{0 \leq t \leq 1 : (1 - t)z + tw \in \mathcal{O}_2\},$$

then $0 < t^* < 1$, and the point $(1 - t^*)z + t^*w$, which is not in $K$, cannot
belong to either $\mathcal{O}_1$ or $\mathcal{O}_2$ since these sets are open. Similar arguments
imply the same conclusions for $F_2$, and we have reached the desired
contradiction. Thus $K^c$ is connected.

$^1$ Here, $d(z, \Omega^c) = \inf\{|z - w| : w \in \Omega^c\}$ denotes the distance from $z$ to $\Omega^c$.

Therefore, Runge’s theorem guarantees that $f$ can be approximated
uniformly on $K$, and hence on $\gamma$, by polynomials. However, $\int_\gamma P(z)\,dz = 0$
whenever $P$ is a polynomial, so in the limit we conclude that $\int_\gamma f(z)\,dz = 0$,
as desired.

The converse result, that $\Omega^c$ is connected whenever $\Omega$ is bounded and
simply connected, will follow from the notion of winding numbers, which
we discuss next.

Winding numbers 

If $\gamma$ is a closed curve in $\mathbb{C}$ and $z$ a point not lying on $\gamma$, then we may
calculate the number of times the curve $\gamma$ winds around $z$ by looking at
the change of argument of the quantity $\zeta - z$ as $\zeta$ travels on $\gamma$. Every time
$\gamma$ loops around $z$, the quantity $(1/2\pi) \arg(\zeta - z)$ increases (or decreases)
by 1. If we recall that $\log w = \log |w| + i \arg w$, and denote the beginning
and ending points of $\gamma$ by $\zeta_1$ and $\zeta_2$, then we may guess that the quantity

$$\frac{1}{2\pi i}\left[\log(\zeta_2 - z) - \log(\zeta_1 - z)\right], \quad \text{which equals} \quad \frac{1}{2\pi i}\int_\gamma \frac{d\zeta}{\zeta - z},$$

computes precisely the number of times $\gamma$ loops around $z$.

These considerations lead to the following precise definition: the
winding number of a closed curve $\gamma$ around a point $z \notin \gamma$ is

$$W_\gamma(z) = \frac{1}{2\pi i}\int_\gamma \frac{d\zeta}{\zeta - z}.$$

Sometimes, $W_\gamma(z)$ is also called the index of $z$ with respect to $\gamma$.

For example, if $\gamma(t) = e^{ikt}$, $0 \leq t \leq 2\pi$, is the unit circle traversed $k$
times in the positive direction (with $k \in \mathbb{N}$), then $W_\gamma(0) = k$. In fact,
one has

$$W_\gamma(z) = \begin{cases} k & \text{if } |z| < 1, \\ 0 & \text{if } |z| > 1. \end{cases}$$

Similarly, if $\gamma(t) = e^{-ikt}$, $0 \leq t \leq 2\pi$, is the unit circle traversed $k$ times
in the negative direction, then we find that $W_\gamma(z) = -k$ in the interior
of the disc, and $W_\gamma(z) = 0$ in its exterior.
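The circle examples above can be checked numerically. The following is a minimal sketch, not part of the original text: the names `winding_number` and `circle` are illustrative, the curve is assumed to be parametrized on $[0, 1]$ rather than $[0, 2\pi]$, and the winding number is approximated by summing the small changes of $\arg(\gamma(t) - z)$ along a fine subdivision instead of evaluating the integral exactly.

```python
import cmath
import math

def winding_number(gamma, z, n=20000):
    """Approximate the winding number of the closed curve gamma around z.

    gamma maps [0, 1] to the complex plane with gamma(0) == gamma(1),
    and z is assumed not to lie on the curve. The total change of
    arg(gamma(t) - z) is accumulated over n small steps and divided
    by 2*pi.
    """
    total = 0.0
    prev = gamma(0.0) - z
    for j in range(1, n + 1):
        cur = gamma(j / n) - z
        # phase(cur / prev) is the change of argument over one step,
        # valid as long as the step is small enough (less than pi).
        total += cmath.phase(cur / prev)
        prev = cur
    return total / (2 * math.pi)

def circle(k):
    """Unit circle traversed k times (negative k: negative direction)."""
    return lambda t: cmath.exp(2j * math.pi * k * t)

print(round(winding_number(circle(3), 0.2 + 0.1j)))   # point inside the disc
print(round(winding_number(circle(-2), 0)))           # negative direction
print(round(winding_number(circle(3), 2.0)))          # point outside the disc
```

Rounding to the nearest integer is justified by the fact that the winding number is always an integer, which is proved in the lemma that follows.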

Note that, if $\gamma$ denotes a positively oriented toy contour, then

$$W_\gamma(z) = \begin{cases} 1 & \text{if } z \in \text{interior of } \gamma, \\ 0 & \text{if } z \in \text{exterior of } \gamma. \end{cases}$$

In general we have the following natural facts about winding numbers.

Lemma 1.3 Let $\gamma$ be a closed curve in $\mathbb{C}$.

(i) If $z \notin \gamma$, then $W_\gamma(z) \in \mathbb{Z}$.

(ii) If $z$ and $w$ belong to the same open connected component in the
complement of $\gamma$, then $W_\gamma(z) = W_\gamma(w)$.

(iii) If $z$ belongs to the unbounded connected component in the
complement of $\gamma$, then $W_\gamma(z) = 0$.

Proof. To see why (i) is true, suppose that $\gamma : [0,1] \to \mathbb{C}$ is a
parametrization for the curve, and let

$$G(t) = \int_0^t \frac{\gamma'(s)}{\gamma(s) - z}\,ds.$$

Then $G$ is continuous and, except possibly at finitely many points, it
is differentiable with $G'(t) = \gamma'(t)/(\gamma(t) - z)$. This implies that, except
possibly at finitely many points, the derivative of the continuous function
$H(t) = (\gamma(t) - z)e^{-G(t)}$ is zero, and hence $H$ must be constant. Writing
this constant as $1/c$, we have $e^{G(t)} = c(\gamma(t) - z)$ for all $t$. Putting
$t = 0$ and recalling that $\gamma$ is closed, so that $\gamma(0) = \gamma(1)$, we find

$$1 = e^{G(0)} = c(\gamma(0) - z) = c(\gamma(1) - z) = e^{G(1)}.$$

Therefore, $G(1)$ is an integral multiple of $2\pi i$, as desired.
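The claim that $H$ has zero derivative is a one-line product-rule computation. It is written out here as a sketch, using only the functions $G$ and $H$ defined in the proof and valid at every $t$ where $\gamma$ is differentiable:

```latex
\begin{align*}
H'(t) &= \gamma'(t)\,e^{-G(t)} - (\gamma(t) - z)\,G'(t)\,e^{-G(t)} \\
      &= \gamma'(t)\,e^{-G(t)}
         - (\gamma(t) - z)\,\frac{\gamma'(t)}{\gamma(t) - z}\,e^{-G(t)} = 0 .
\end{align*}
% Since W_\gamma(z) = G(1)/(2\pi i) and e^{G(1)} = 1, it follows that
% W_\gamma(z) is an integer.
```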

For (ii), we simply note that $W_\gamma(z)$ is a continuous function of $z \notin \gamma$
that is integer-valued, so it must be constant in any open connected
component in the complement of $\gamma$.

Finally, one observes that $\lim_{|z| \to \infty} W_\gamma(z) = 0$, and, combined with (ii),
this establishes (iii).

We now show that the notion of a bounded simply connected set
may be understood in the following sense: no curve in $\Omega$ winds around
points in $\Omega^c$.

Theorem 1.4 A bounded region $\Omega$ is simply connected if and only if
$W_\gamma(z) = 0$ for any closed curve $\gamma$ in $\Omega$ and any point $z$ not in $\Omega$.

Proof. If $\Omega$ is simply connected and $z \notin \Omega$, then $f(\zeta) = 1/(\zeta - z)$ is
holomorphic in $\Omega$, and Cauchy’s theorem gives $W_\gamma(z) = 0$.

For the converse, it suffices to prove that the complement of $\Omega$ is
connected (Theorem 1.2). We argue by contradiction, and construct an
explicit closed curve $\gamma$ in $\Omega$ and find a point $w$ so that $W_\gamma(w) \neq 0$.

If we suppose that $\Omega^c$ is not connected, then we may write
$\Omega^c = F_1 \cup F_2$ where $F_1$, $F_2$ are disjoint, closed, and non-empty. Only
one of these sets can be unbounded, so that we may assume that $F_1$ is
bounded, thus compact. The curve $\gamma$ will be constructed as part of the
boundary of an appropriate union of squares.

Lemma 1.5 Let $w$ be any point in $F_1$. Under the above assumptions,
there exists a finite collection of closed squares $\mathcal{Q} = \{Q_1, \ldots, Q_n\}$ that
belong to a uniform grid $\mathcal{G}$ of the plane, and are such that:

(i) $w$ belongs to the interior of $Q_1$.

(ii) The interiors of $Q_j$ and $Q_k$ are disjoint when $j \neq k$.

(iii) $F_1$ is contained in the interior of $\bigcup_{j=1}^n Q_j$.

(iv) $\bigcup_{j=1}^n Q_j$ is disjoint from $F_2$.

(v) The boundary of $\bigcup_{j=1}^n Q_j$ lies entirely in $\Omega$, and consists of a finite
union of disjoint simple closed polygonal curves.

Assuming this lemma for now, we may easily finish the proof of the
theorem. The boundary $\partial Q_j$ of each square is equipped with the positive
orientation. Since $w \in Q_1$, and $w \notin Q_j$ for all $j > 1$, we have

$$\sum_{j=1}^n \frac{1}{2\pi i}\int_{\partial Q_j} \frac{d\zeta}{\zeta - w} = 1. \tag{2}$$

If $\gamma_1, \ldots, \gamma_M$ denote the polygonal curves in (v) of the lemma, then the
cancellations arising from integrating over the same side but in opposite
directions in (2) yield

$$\sum_{j=1}^M \frac{1}{2\pi i}\int_{\gamma_j} \frac{d\zeta}{\zeta - w} = 1,$$

and hence $W_{\gamma_{j_0}}(w) \neq 0$ for some $j_0$. The closed curve $\gamma_{j_0}$ lies entirely in
$\Omega$, and this gives the desired contradiction.

Proof of the lemma. Since $F_1$ is compact and $F_2$ is closed, the sets $F_1$
and $F_2$ are at a finite non-zero distance $d$ from one another. Now consider
a uniform grid $\mathcal{G}_0$ of the plane consisting of closed squares of side length
which is much smaller than $d$, say $< d/100$, and such that $w$ lies at the
center of a closed square $R_1$ in this grid. Let $\mathcal{R} = \{R_1, \ldots, R_m\}$ denote the
finite collection of all closed squares in the grid that intersect $F_1$. Then,
the collection $\mathcal{R}$ satisfies properties (i) through (iv) of the lemma. To
guarantee (v), we argue as follows.

The boundary of each square in $\mathcal{R}$ is given the positive (counterclockwise)
orientation. The boundary of $\bigcup_{j=1}^m R_j$ is then equal to the union
of all boundary sides, that is, those sides that do not belong to two
adjacent squares in the collection $\mathcal{R}$. Similarly, the boundary vertices are
the end-points of all boundary sides. A boundary vertex is said to be
“bad,” if it is the end-point of more than two boundary sides. (See point
$P$ on Figure 1.)


Figure 1. Eliminating bad boundary vertices


To eliminate the bad boundary vertices, we refine the grid $\mathcal{G}_0$ and
possibly add some squares. More precisely, consider the grid $\mathcal{G}$ obtained
as a refinement of the original grid, by dissecting all squares of $\mathcal{G}_0$ into
nine equal subsquares. Then, let $Q_1, \ldots, Q_p$ denote all the squares in the
grid $\mathcal{G}$ that are subsquares of squares in the collection $\mathcal{R}$ (so in particular,
$p = 9m$), and where $Q_1$ is chosen so that $w \in Q_1$. Then, we may add
finitely many squares from $\mathcal{G}$ near each bad boundary vertex, so that the
resulting family $\mathcal{Q} = \{Q_1, \ldots, Q_n\}$ has no bad boundary vertices. (See
Figure 1.)

Clearly, $\mathcal{Q}$ still satisfies (i) through (iv), and we claim this
collection also satisfies (v). Indeed, let $[a_1, a_2]$ denote any boundary side of
$\bigcup_{j=1}^n Q_j$ with its orientation from $a_1$ to $a_2$. By considering the three
different possibilities, one sees that $a_2$ is the beginning point of another
boundary side $[a_2, a_3]$. Continuing in this fashion, we obtain a sequence
of boundary sides $[a_1, a_2], [a_2, a_3], \ldots, [a_n, a_{n+1}], \ldots$. Since there are only
finitely many sides, we must have $a_n = a_m$ for some $n$ and some $m > n$.
We may choose the smallest $m$ so that $a_n = a_m$, say $m = m'$. Then,
we note that if $n > 1$, then $a_{m'}$ is an end-point of at least three boundary
sides, namely $[a_{n-1}, a_n]$, $[a_n, a_{n+1}]$, and $[a_{m'-1}, a_{m'}]$, hence $a_{m'}$ is
a bad boundary vertex. Since we arranged that $\mathcal{Q}$ had no such boundary
vertices, we conclude that $n = 1$, and hence the polygon formed by
$a_1, \ldots, a_{m'}$ is closed and simple. We may repeat this process and find
that $\mathcal{Q}$ satisfies property (v), and the proof of Lemma 1.5 is complete.
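The notions of boundary side and bad boundary vertex used in this proof are purely combinatorial, and can be illustrated with a short sketch. It is not part of the original argument: it assumes unit grid squares indexed by the integer coordinates of their lower-left corner, and the function names `boundary_sides` and `bad_vertices` are hypothetical.

```python
from collections import Counter

def boundary_sides(squares):
    """Return the sides that belong to exactly one square of the collection.

    squares is a set of (i, j) pairs, each denoting the closed unit square
    with lower-left corner (i, j). A side is stored as a frozenset of its
    two end-points, so orientation is ignored here.
    """
    count = Counter()
    for (i, j) in squares:
        corners = [(i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1)]
        for k in range(4):
            side = frozenset((corners[k], corners[(k + 1) % 4]))
            count[side] += 1
    # A side shared by two adjacent squares is counted twice and cancels.
    return {side for side, c in count.items() if c == 1}

def bad_vertices(squares):
    """Return the boundary vertices that are end-points of more than two
    boundary sides."""
    degree = Counter()
    for side in boundary_sides(squares):
        for vertex in side:
            degree[vertex] += 1
    return {v for v, d in degree.items() if d > 2}

# Two squares sharing a full side: the common side is not a boundary side.
print(len(boundary_sides({(0, 0), (1, 0)})))
# Two squares touching only at a corner: that corner is a bad vertex.
print(bad_vertices({(0, 0), (1, 1)}))
```

The second example is the configuration that the refinement step of the proof is designed to remove: adding squares near the shared corner leaves every boundary vertex with exactly two boundary sides.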

Finally, we are now able to finish the proof of Theorem 1.2, namely,
if $\Omega$ is bounded and simply connected we can conclude that $\Omega^c$ is
connected. To see this, note that if $\Omega^c$ is not connected, then we have
constructed a curve $\gamma \subset \Omega$ and found a point $w \notin \Omega$ so that $W_\gamma(w) \neq 0$,
thus contradicting the fact that $\Omega$ is simply connected.

2 The Jordan curve theorem 

Although we emphasize in the statement of the theorems which follow
that the curves are piecewise-smooth, we note that the proofs involve the
use of curves that may only be continuous (the curves $\Gamma_\epsilon$ below).

The two main results in this section are the following. 

Theorem 2.1 Let $\Gamma$ be a curve in the plane that is simple and
piecewise-smooth. Then, the complement of $\Gamma$ is an open connected set whose
boundary is precisely $\Gamma$.

Theorem 2.2 Let $\Gamma$ be a curve in the plane which is simple, closed, and
piecewise-smooth. Then, the complement of $\Gamma$ consists of two disjoint
connected open sets. Precisely one of these regions is bounded and simply
connected; it is called the interior of $\Gamma$ and denoted by $\Omega$. The other
component is unbounded, called the exterior of $\Gamma$, and denoted by $\mathcal{U}$.

Moreover, with the appropriate orientation for $\Gamma$, we have

$$W_\Gamma(z) = \begin{cases} 1 & \text{if } z \in \Omega, \\ 0 & \text{if } z \in \mathcal{U}. \end{cases}$$

Remark. These two theorems continue to hold in the general case
where we drop the assumption that the curves are piecewise-smooth.
However, as it turns out, the proofs then are more difficult. Fortunately,
the restricted setting of piecewise-smooth curves suffices for many
applications.

As a consequence of the above theorems, we may state a version of
Cauchy’s theorem as follows:

Theorem 2.3 Suppose $f$ is a function that is holomorphic in the
interior $\Omega$ of a simple closed curve $\Gamma$. Then

$$\int_\eta f(\zeta)\,d\zeta = 0$$

whenever $\eta$ is any closed curve contained in $\Omega$.

The idea of the proof of Theorem 2.1 can be roughly summarized as
follows. Since the complement of $\Gamma$ is open, it is sufficient to show it
is pathwise connected (Exercise 5, Chapter 1). Let $z$ and $w$ belong to
the complement of $\Gamma$, and join these two points by a curve. If this curve
intersects $\Gamma$, we first connect $z$ to $z'$ and $w$ to $w'$, where $z'$ and $w'$ are
close to $\Gamma$, by curves that do not intersect $\Gamma$. Then, we join $z'$ to $w'$
by traveling “parallel” to the curve $\Gamma$ and going around its end-points if
necessary.

Therefore, the key is to construct a family of continuous curves that are
“parallel” to $\Gamma$. This can be achieved because of the conditions imposed
on the curve. Indeed, if $\gamma$ is a parametrization for a smooth piece of $\Gamma$,
then $\gamma$ is continuously differentiable, and $\gamma'(t) \neq 0$. Moreover, the vector
$\gamma'(t)$ is tangent to $\Gamma$. Consequently, $i\gamma'(t)$ is perpendicular to $\Gamma$, and
if $\Gamma$ is simple, considering $\gamma(t) + i\epsilon\gamma'(t)$ amounts to a new curve that is
“parallel” to $\Gamma$. The details are as follows.

In the next three lemmas and two propositions, we emphasize that $\Gamma_0$
denotes a simple smooth curve. We recall that an arc-length
parametrization $\gamma$ for a smooth curve $\Gamma_0$ satisfies $|\gamma'(t)| = 1$ for all $t$. Every smooth
curve has an arc-length parametrization.

Lemma 2.4 Let $\Gamma_0$ be a simple smooth curve with an arc-length
parametrization given by $\gamma : [0, L] \to \mathbb{C}$. For each real number $\epsilon$, let $\Gamma_\epsilon$ be the
continuous curve defined by the parametrization

$$\gamma_\epsilon(t) = \gamma(t) + i\epsilon\gamma'(t), \quad \text{for } 0 \leq t \leq L.$$

Then, there exists $\kappa_1 > 0$ so that $\Gamma_0 \cap \Gamma_\epsilon = \emptyset$ whenever $0 < |\epsilon| < \kappa_1$.

Proof. We first prove the result locally. If $s$ and $t$ belong to $[0, L]$,
then

$$\begin{aligned}
\gamma_\epsilon(t) - \gamma(s) &= \gamma(t) - \gamma(s) + i\epsilon\gamma'(t) \\
&= \int_s^t \gamma'(u)\,du + i\epsilon\gamma'(t) \\
&= \int_s^t [\gamma'(u) - \gamma'(t)]\,du + (t - s + i\epsilon)\gamma'(t).
\end{aligned}$$

Since $\gamma'$ is uniformly continuous on $[0, L]$, there exists $\delta > 0$ so that
$|\gamma'(x) - \gamma'(y)| < 1/2$ whenever $|x - y| < \delta$. In particular, if $|s - t| < \delta$
we find that

$$|\gamma_\epsilon(t) - \gamma(s)| \geq |t - s + i\epsilon|\,|\gamma'(t)| - \frac{|t - s|}{2}.$$

Since $\gamma$ is an arc-length parametrization, we have $|\gamma'(t)| = 1$, and hence

$$|\gamma_\epsilon(t) - \gamma(s)| \geq |\epsilon|/2,$$

where we have used the simple fact that $2|a + ib| \geq |a| + |b|$ whenever $a$
and $b$ are real. This proves that $\gamma_\epsilon(t) \neq \gamma(s)$ whenever $|t - s| < \delta$ and
$\epsilon \neq 0$.
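The elementary inequality invoked at the end of this step, and the way it produces the lower bound $|\epsilon|/2$, can be spelled out as follows. This is a sketch in the notation of the proof, assuming $|s - t| < \delta$ and $|\gamma'(t)| = 1$:

```latex
% For real a and b, (|a| + |b|)^2 \le 2(a^2 + b^2) \le 4(a^2 + b^2), hence
\[
  |a| + |b| \le 2\,|a + ib| .
\]
% Taking a = t - s and b = \epsilon, and bounding the integral term by |t - s|/2:
\begin{align*}
  |\gamma_\epsilon(t) - \gamma(s)|
    &\ge |t - s + i\epsilon|
       - \Bigl|\int_s^t [\gamma'(u) - \gamma'(t)]\,du\Bigr| \\
    &\ge \tfrac12\bigl(|t - s| + |\epsilon|\bigr) - \tfrac12\,|t - s|
     = \tfrac12\,|\epsilon| .
\end{align*}
```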

To conclude the proof of the lemma, we argue as follows. (See Figure 2 
for an illustration of the argument.) 



Figure 2. Situation in the proof of Lemma 2.4


Let $0 = t_0 < \cdots < t_n = L$ be a partition of $[0, L]$ with $|t_{k+1} - t_k| < \delta$
for all $k$, and consider

$$I_k = \{t : |t - t_k| \leq \delta/4\}, \quad J_k = \{t : |t - t_k| \leq \delta/2\},$$

and

$$J'_k = \{t : |t - t_k| \geq \delta/2\}.$$

Then, we have just proved that

$$\gamma(I_k) \cap \gamma_\epsilon(J_k) = \emptyset \quad \text{whenever } \epsilon \neq 0. \tag{3}$$

Since $\Gamma_0$ is simple, the distance $d_k$ between the two compact sets $\gamma(I_k)$
and $\gamma(J'_k)$ is strictly positive. We now claim that

$$\gamma(I_k) \cap \gamma_\epsilon(J'_k) = \emptyset \quad \text{whenever } |\epsilon| < d_k/2. \tag{4}$$

Indeed, if $z \in \gamma(I_k)$ and $w \in \gamma_\epsilon(J'_k)$, then we choose $s$ in $J'_k$ so that
$w = \gamma_\epsilon(s)$ and let $\zeta = \gamma(s)$. The triangle inequality then implies

$$|z - w| \geq |z - \zeta| - |\zeta - w| \geq d_k - |\epsilon| > d_k/2,$$

and the claim is established. Finally, if we choose $\kappa_1 = \min_k d_k/2$, then (3)
and (4) imply that $\Gamma_0 \cap \Gamma_\epsilon = \emptyset$ whenever $0 < |\epsilon| < \kappa_1$, as desired.
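The conclusion of Lemma 2.4 can be observed numerically on a simple example. The sketch below is an illustration only, under the assumption that $\Gamma_0$ is the upper unit semicircle with arc-length parametrization $\gamma(t) = e^{it}$, $0 \leq t \leq \pi$; the function names `gamma`, `dgamma`, `gamma_eps` and `min_distance` are hypothetical, and the distance between the two curves is estimated from finitely many sample points.

```python
import cmath
import math

def gamma(t):
    """Arc-length parametrization of the upper unit semicircle, 0 <= t <= pi."""
    return cmath.exp(1j * t)

def dgamma(t):
    """Derivative of gamma; it has modulus 1 for every t."""
    return 1j * cmath.exp(1j * t)

def gamma_eps(t, eps):
    """Point of the parallel curve: gamma(t) + i * eps * gamma'(t)."""
    return gamma(t) + 1j * eps * dgamma(t)

def min_distance(eps, length=math.pi, n=300):
    """Smallest distance between sampled points of the parallel curve
    and sampled points of the original curve."""
    ts = [length * k / n for k in range(n + 1)]
    base = [gamma(t) for t in ts]
    shifted = [gamma_eps(t, eps) for t in ts]
    return min(abs(p - q) for p in shifted for q in base)

print(min_distance(0.1))
print(min_distance(-0.25))
```

For this particular curve one has $\gamma_\epsilon(t) = (1 - \epsilon)e^{it}$, so the parallel curve is an arc of the circle of radius $1 - \epsilon$ and stays at distance $|\epsilon|$ from $\Gamma_0$, in agreement with the local bound $|\epsilon|/2$ obtained in the proof.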

The next lemma shows that any point close to an interior point of the
curve belongs to one of its parallel translates. By an interior point of
the curve, we mean a point of the form $\gamma(t)$ with $t$ in the open interval
$(0, L)$. Such a point should not be confused with an “interior” point
of a curve, as in Theorem 2.2.

Lemma 2.5 Suppose $z$ is a point which does not belong to the smooth
curve $\Gamma_0$, but that is closer to an interior point of the curve than to either
of its end-points. Then $z$ belongs to $\Gamma_\epsilon$ for some $\epsilon \neq 0$.

More precisely, if $z_0 \in \Gamma_0$ is closest to $z$ and $z_0 = \gamma(t_0)$ for some $t_0$ in
the open interval $(0, L)$, then $z = \gamma(t_0) + i\epsilon\gamma'(t_0)$ for some $\epsilon \neq 0$.

Proof. For $t$ in a neighborhood of $t_0$ the fact that $\gamma$ is differentiable
guarantees that

$$z - \gamma(t) = z - \gamma(t_0) - \gamma'(t_0)(t - t_0) + o(|t - t_0|).$$

Since $z_0 = \gamma(t_0)$ minimizes the distance from $z$ to $\Gamma_0$, we find that

$$|z - z_0|^2 \leq |z - \gamma(t)|^2 = |z - z_0|^2 - 2(t - t_0)\,\mathrm{Re}\left([z - \gamma(t_0)]\overline{\gamma'(t_0)}\right) + o(|t - t_0|).$$

Since $t - t_0$ can take on positive or negative values, we must have
$\mathrm{Re}\left([z - \gamma(t_0)]\overline{\gamma'(t_0)}\right) = 0$, otherwise the above inequality can be
violated for $t$ close to $t_0$. As a result, there exists a real number $\epsilon$ with
$[z - \gamma(t_0)]\overline{\gamma'(t_0)} = i\epsilon$. Since $|\gamma'(t_0)| = 1$ we have $\overline{\gamma'(t_0)} = 1/\gamma'(t_0)$, and
therefore $z - \gamma(t_0) = i\epsilon\gamma'(t_0)$. The proof of the lemma is complete.

Suppose that $z$ and $w$ are close to interior points of $\Gamma_0$, so that $z \in \Gamma_\epsilon$
and $w \in \Gamma_\eta$ for some non-zero $\epsilon$ and $\eta$. If $\epsilon$ and $\eta$ have the same sign, we
say that the points $z$ and $w$ belong to the same side of $\Gamma_0$. Otherwise,
$z$ and $w$ are said to be on opposite sides of $\Gamma_0$. We stress the fact that
we do not attempt to define the “two sides of $\Gamma_0$,” but only that given
two points near $\Gamma_0$, we may infer if they are on the “same side” or on
“opposite sides.” Also, nothing we have done so far shows that these
conditions are mutually exclusive.

Roughly speaking, points on the same side can be joined almost
directly by a curve “parallel” to $\Gamma_0$, while for points on opposite sides, we
also need to go around one of the end-points of $\Gamma_0$.

We first investigate the situation for points on the same side of $\Gamma_0$.

Proposition 2.6 Let $A$ and $B$ denote the two end-points of a simple
smooth curve $\Gamma_0$, and suppose that $K$ is a compact set that satisfies
either

$$\Gamma_0 \cap K = \emptyset \quad \text{or} \quad \Gamma_0 \cap K = A \cup B.$$

If $z \notin \Gamma_0$ and $w \notin \Gamma_0$ lie on the same side of $\Gamma_0$, and are closer to interior
points of $\Gamma_0$ than they are to $K$ or to the end-points of $\Gamma_0$, then $z$ and $w$
can be joined by a continuous curve that lies entirely in the complement
of $K \cup \Gamma_0$.

The unspecified compact set $K$ will be chosen appropriately in the
proof of the Jordan curve theorem.

Proof. By the previous lemma, consider $z_0 = \gamma(t_0)$ and $w_0 = \gamma(s_0)$
that are interior points of $\Gamma_0$ closest to $z$ and $w$, respectively. Then

$$z = \gamma(t_0) + i\epsilon_0\gamma'(t_0) \quad \text{and} \quad w = \gamma(s_0) + i\eta_0\gamma'(s_0),$$

where $\epsilon_0$ and $\eta_0$ have the same sign, which we may assume to be positive.
We may also assume that $t_0 \leq s_0$.

The hypothesis of the proposition implies that the line segments joining $z$
to $z_0$ and $w$ to $w_0$ are entirely contained in the complement of $K$ and
$\Gamma_0$. Therefore, for all small $\epsilon > 0$, we may join $z$ and $w$ to the points

$$z_\epsilon = \gamma(t_0) + i\epsilon\gamma'(t_0) \quad \text{and} \quad w_\epsilon = \gamma(s_0) + i\epsilon\gamma'(s_0),$$

respectively. See Figure 3.

Figure 3. Situation in the proof of Proposition 2.6

Finally, if $\epsilon$ is chosen smaller than $\kappa_1$ in Lemma 2.4 and also smaller
than the distance from $K$ to the part of $\Gamma_0$ between $z_0$ and $w_0$, that is,
$\{\gamma(t) : t_0 \leq t \leq s_0\}$, then the corresponding part of $\Gamma_\epsilon$, namely
$\{\gamma_\epsilon(t) : t_0 \leq t \leq s_0\}$, joins the point $z_\epsilon$ to $w_\epsilon$. Moreover, this curve is contained
in the complement of $K$ and $\Gamma_0$. This proves the proposition.

To join points on opposite sides of $\Gamma_0$, we need the following
preliminary result, which ensures that there is enough room necessary to travel
around the end-points.

Lemma 2.7 Let $\Gamma_0$ be a simple smooth curve. There exists $\kappa_2 > 0$ so
that the set $N$, which consists of points of the form $z = \gamma(L) + \epsilon e^{i\theta}\gamma'(L)$
with $-\pi/2 \leq \theta \leq \pi/2$ and $0 < \epsilon < \kappa_2$, is disjoint from $\Gamma_0$.

Proof. The argument is similar to the one given in the proof of
Lemma 2.4. First, we note that

$$\gamma(L) + \epsilon e^{i\theta}\gamma'(L) - \gamma(t) = \int_t^L [\gamma'(u) - \gamma'(L)]\,du + (L - t + \epsilon e^{i\theta})\gamma'(L).$$

If we choose $\delta$ so that $|\gamma'(u) - \gamma'(L)| < 1/2$ when $|u - L| < \delta$, then
$|t - L| < \delta$ implies

$$|\gamma(L) + \epsilon e^{i\theta}\gamma'(L) - \gamma(t)| \geq |\epsilon|/2.$$

Therefore $\gamma(t) \notin N$ whenever $L - \delta \leq t \leq L$. Finally, it suffices to choose
$\kappa_2$ smaller than the distance from the end-point $\gamma(L)$ to the rest of the
curve $\gamma(t)$ with $0 \leq t \leq L - \delta$, to conclude the proof.

Finally, we may state the result analogous to Proposition 2.6 for points
that could lie on opposite sides of $\Gamma_0$.

Proposition 2.8 Let $A$ denote an end-point of the simple smooth curve
$\Gamma_0$, and suppose that $K$ is a compact set that satisfies either

$$\Gamma_0 \cap K = \emptyset \quad \text{or} \quad \Gamma_0 \cap K = A.$$

If $z \notin \Gamma_0$ and $w \notin \Gamma_0$ are closer to interior points of $\Gamma_0$ than they are to
$K$ or to the end-points of $\Gamma_0$, then $z$ and $w$ can be joined by a continuous
curve that lies entirely in the complement of $\Gamma_0 \cup K$.

We only provide an outline of the argument, which is similar to the
proof of Proposition 2.6. It suffices to consider the case when $z$ and $w$
lie on opposite sides of $\Gamma_0$ and $A = \gamma(0)$. First, we may join

$$z_\epsilon = \gamma(t_0) + i\epsilon\gamma'(t_0) \quad \text{and} \quad w_\epsilon = \gamma(s_0) - i\epsilon\gamma'(s_0)$$

to the points

$$z'_\epsilon = \gamma(L) + i\epsilon\gamma'(L) \quad \text{and} \quad w'_\epsilon = \gamma(L) - i\epsilon\gamma'(L).$$

Then, $z'_\epsilon$ and $w'_\epsilon$ may be joined within the “half-neighborhood” $N$ of
Lemma 2.7. Here, if $t_0 \leq s_0$ we must select $|\epsilon|$ smaller than the
distance from $\{\gamma(t) : t_0 \leq t \leq L\}$ to $K$, and also smaller than $\kappa_1$ and $\kappa_2$ of
Lemmas 2.4 and 2.7.

Proof of Theorem 2.1 

Let $\Gamma$ be a simple piecewise-smooth curve.

First, we prove that the boundary of the set $\mathcal{O} = \Gamma^c$ is precisely $\Gamma$.
Clearly, $\mathcal{O}$ is an open set whose boundary is contained in $\Gamma$. Moreover,
any point where $\Gamma$ is smooth also belongs to the boundary of $\mathcal{O}$ (by
Lemma 2.4 for instance). Since the boundary of $\mathcal{O}$ must also be closed,
we conclude it is equal to all of $\Gamma$.

Figure 4. Situation in the proof of Proposition 2.8


The proof that $\mathcal{O}$ is connected is by induction on the number of smooth
curves constituting $\Gamma$. Suppose first that $\Gamma$ is simple and smooth, and let
$Z$ and $W$ be any two points that do not lie on $\Gamma$. Let $\Lambda$ be any smooth
curve in $\mathbb{C}$ that joins $Z$ and $W$, and which omits the two end-points of $\Gamma$.
If $\Lambda$ intersects $\Gamma$, it does so at interior points. Therefore, we may join $Z$
by a piece of $\Lambda$ that does not intersect $\Gamma$ to a point $z$ that is closer to the
interior of $\Gamma$ than to either of its end-points. Similarly, $W$ can be joined
in the complement of $\Gamma$ to a point $w$ also closer to the interior of $\Gamma$ than
to either of its end-points. Proposition 2.8 (with $K$ empty) then shows
that $z$ and $w$ can be joined by a continuous curve in the complement of
$\Gamma$. Altogether, we may join any two points in the complement of $\Gamma$, and
this proves the base step of the induction.

Suppose that the theorem is proved for all curves containing $n - 1$
smooth curves, and let $\Gamma$ consist of $n$ smooth curves, so that we may
write

$$\Gamma = K \cup \Gamma_0,$$

where $K$ is the union of $n - 1$ consecutive smooth curves, and $\Gamma_0$ is
smooth. In particular, $K$ is compact and intersects $\Gamma_0$ in a single one of
its end-points. By the induction hypothesis, any two points $Z$ and $W$ in
the complement of $\Gamma$ can be joined by a curve that does not intersect $K$,
and we may also assume that this curve omits both end-points of $\Gamma_0$. If
this curve intersects $\Gamma_0$ in its interior, then we apply Proposition 2.8 to
conclude the proof of the theorem.

Proof of Theorem 2.2

Let $\Gamma$ denote a curve which is simple, closed, and piecewise-smooth. We
first prove that the complement of $\Gamma$ consists of at most two components.

Fix a point $W$ that lies outside some large disc that contains $\Gamma$, and
let $\mathcal{U}$ denote the set of all points that can be joined to $W$ by a continuous
curve that lies entirely in the complement of $\Gamma$. The set $\mathcal{U}$ is clearly open,
and also connected since any two points can be joined by passing first
through $W$. Now we define

$$\Omega = \Gamma^c - \mathcal{U}.$$

We must show that $\Omega$ is connected. To this end, let $K$ denote the curve
obtained by deleting a smooth piece $\Gamma_0$ of $\Gamma$. By the Jordan arc theorem
(Theorem 2.1), we may join any point $Z \in \Omega$ to $W$ by a curve $\Lambda_Z$ that does
not intersect $K$. Since $Z \notin \mathcal{U}$, the curve $\Lambda_Z$ must intersect $\Gamma_0$ at one of its
interior points. We may therefore choose two points $z, w \in \Lambda_Z$ closer to interior
points of $\Gamma_0$ than to either of its end-points, and so that the pieces of $\Lambda_Z$
joining $Z$ to $z$ and $W$ to $w$ are entirely contained in the complement of
$\Gamma$. Then, the points $z$ and $w$ are on opposite sides of $\Gamma_0$, for otherwise,
we could apply Proposition 2.6 to find that $Z$ can be joined to $W$ by a
curve lying in the complement of $\Gamma$, and this contradicts $Z \notin \mathcal{U}$. Finally,
if $Z_1$ is another point in $\Omega$, the two corresponding points $z_1$ and $w_1$
must also lie on opposite sides of $\Gamma_0$. Moreover, $z$ and $z_1$ must lie on
the same side of $\Gamma_0$, for otherwise $z$ and $w_1$ do, and we can once again
join $Z$ to $W$ without crossing $\Gamma$, thus contradicting $Z \notin \mathcal{U}$. Therefore,
by Proposition 2.6 the points $z$ and $z_1$ can be joined by a curve in the
complement of $\Gamma$, and we conclude that $Z$ and $Z_1$ belong to the same
connected component.

The argument thus far proves that $\Gamma^c$ contains at most two
components, but nothing as yet guarantees that $\Omega$ is non-empty. To show that
$\Gamma^c$ has precisely two components, it suffices (by Lemma 1.3) to prove
that there are points that have different winding numbers with respect
to $\Gamma$. In fact, we claim that points that are on opposite sides of $\Gamma$ have
winding numbers that differ by 1. To see this, fix a point $z_0$ on a smooth
part of $\Gamma$, say $z_0 = \gamma(t_0)$, let $\epsilon > 0$, and define

$$z_\epsilon = \gamma(t_0) + i\epsilon\gamma'(t_0) \quad \text{and} \quad w_\epsilon = \gamma(t_0) - i\epsilon\gamma'(t_0).$$

By our previous observations, points on the same side of $\Gamma$ belong to the
same connected component, and hence

$$\Delta = |W_\Gamma(z_\epsilon) - W_\Gamma(w_\epsilon)|$$

is constant for all small $\epsilon > 0$.

First, we may write

$$\frac{\gamma'(t)}{\gamma(t) - z_\epsilon} - \frac{\gamma'(t)}{\gamma(t) - w_\epsilon} = \frac{2i\epsilon\,\gamma'(t_0)\gamma'(t)}{[\gamma(t) - \gamma(t_0)]^2 + \epsilon^2\gamma'(t_0)^2}.$$

For the numerator, we use

$$\gamma'(t) = \gamma'(t_0) + [\gamma'(t) - \gamma'(t_0)] = \gamma'(t_0) + \psi(t),$$

where $\psi(t) \to 0$ as $t \to t_0$. For the denominator, we recall that $\gamma'(t_0) \neq 0$,
so that

$$[\gamma(t) - \gamma(t_0)]^2 + \epsilon^2\gamma'(t_0)^2 = \gamma'(t_0)^2[(t - t_0)^2 + \epsilon^2] + o(|t - t_0|^2).$$

Putting these results together, we see that

$$\frac{\gamma'(t)}{\gamma(t) - z_\epsilon} - \frac{\gamma'(t)}{\gamma(t) - w_\epsilon} = \frac{2i\epsilon}{(t - t_0)^2 + \epsilon^2} + E(t),$$

where given $\eta > 0$, there exists $\delta > 0$ so that if $|t - t_0| \leq \delta$, the error term
satisfies

$$|E(t)| \leq \eta\,\frac{\epsilon}{(t - t_0)^2 + \epsilon^2}.$$

We then write

$$\Delta = \left| \frac{1}{2\pi i}\int_{|t - t_0| \geq \delta} \left(\frac{\gamma'(t)}{\gamma(t) - z_\epsilon} - \frac{\gamma'(t)}{\gamma(t) - w_\epsilon}\right) dt + \frac{1}{2\pi i}\int_{|t - t_0| < \delta} \left(\frac{2i\epsilon}{(t - t_0)^2 + \epsilon^2} + E(t)\right) dt \right|.$$

The first integral goes to 0 as $\epsilon \to 0$. In the second integral we make the
change of variables $t - t_0 = \epsilon s$, and note that

$$\frac{1}{\pi}\int_{-\rho}^{\rho} \frac{ds}{s^2 + 1} = \frac{1}{\pi}\Big[\arctan s\Big]_{-\rho}^{\rho} \to 1 \quad \text{as } \rho \to \infty.$$

We therefore see that letting $\epsilon \to 0$ gives

$$|\Delta - 1| < \eta.$$

We conclude that $\Delta = 1$, and hence $\Gamma^c$ has precisely two components.
Finally, only one of these components can be unbounded, namely $\mathcal{U}$, and
the winding number of $\Gamma$ in this component must therefore be zero. By
our last result, we see that, after possibly reversing the orientation of the
curve, the winding number of any point in the bounded component $\Omega$ is
constant and equal to 1. Also, it is clear from what has been said that
any smooth point on $\Gamma$ can be approached by points in either component,
and hence $\Gamma$ is the boundary of both $\Omega$ and $\mathcal{U}$.
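The limit used in the computation of $\Delta$ above can be written out step by step. The following is a sketch, assuming that for $|t - t_0| < \delta$ the error term obeys the bound $|E(t)| \leq \eta\,\epsilon/((t - t_0)^2 + \epsilon^2)$:

```latex
% Main term, after the change of variables t - t_0 = \epsilon s:
\[
  \frac{1}{2\pi i}\int_{|t - t_0| < \delta}
    \frac{2i\epsilon}{(t - t_0)^2 + \epsilon^2}\,dt
  = \frac{1}{\pi}\int_{-\delta/\epsilon}^{\delta/\epsilon} \frac{ds}{s^2 + 1}
  = \frac{2}{\pi}\arctan\frac{\delta}{\epsilon}
  \longrightarrow 1 \quad (\epsilon \to 0).
\]
% Error term, bounded uniformly in \epsilon:
\[
  \Bigl|\frac{1}{2\pi i}\int_{|t - t_0| < \delta} E(t)\,dt\Bigr|
  \le \frac{\eta}{2\pi}\int_{-\infty}^{\infty}
        \frac{\epsilon\,dt}{(t - t_0)^2 + \epsilon^2}
  = \frac{\eta}{2} .
\]
```

Since $\Delta$ does not depend on $\epsilon$ and $\eta$ is arbitrary, these two estimates force $\Delta = 1$.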

The final step in the proof is to show that the interior of the curve,
that is, the bounded component $\Omega$, is simply connected. By Theorem 1.2
it suffices to show that $\Omega^c$ is connected. If not, then

$$\Omega^c = F_1 \cup F_2,$$

where $F_1$ and $F_2$ are closed, disjoint, and non-empty. Let

$$\mathcal{O}_1 = \mathcal{U} \cap F_1 \quad \text{and} \quad \mathcal{O}_2 = \mathcal{U} \cap F_2.$$

Clearly, $\mathcal{O}_1$ and $\mathcal{O}_2$ are disjoint. If $z \in \mathcal{O}_1$, then $z \in \mathcal{U}$, and every small
ball centered at $z$ is contained in $\mathcal{U}$. If every such ball intersects $F_2$,
then $z \in F_2$ since $F_2$ is closed. However, $F_1$ and $F_2$ are disjoint, so this
cannot happen. Consequently, $\mathcal{O}_1$ is open, and by the same argument, so
is $\mathcal{O}_2$. Finally, we claim that $\mathcal{O}_1$ is non-empty. If not, then $F_1$ is entirely
contained in $\Gamma$ and $\mathcal{U}$ is contained in $F_2$. Pick any point $z \in F_1$, which
we know belongs to $\Gamma$. Now every ball centered at $z$ intersects $\mathcal{U}$, hence
$F_2$. But $F_2$ is closed and disjoint from $F_1$, so we get a contradiction. A
similar argument for $\mathcal{O}_2$ proves that

$$\mathcal{U} = \mathcal{O}_1 \cup \mathcal{O}_2,$$

where $\mathcal{O}_1$, $\mathcal{O}_2$ are disjoint, open, and non-empty. This contradicts the
fact that $\mathcal{U}$ is connected, and concludes the proof of the Jordan curve
theorem for piecewise-smooth curves.


2.1 Proof of a general form of Cauchy’s theorem 

Theorem 2.9 If a function $f$ is holomorphic in an open set that
contains a simple closed piecewise-smooth curve $\Gamma$ and its interior, then

$$\int_\Gamma f(\zeta)\,d\zeta = 0.$$

Let $\mathcal{O}$ denote an open set on which $f$ is holomorphic, and which
contains $\Gamma$ and its interior $\Omega$. The idea is to construct a closed curve $\Lambda$
in $\Omega$ that is so close to $\Gamma$ that $\int_\Gamma f = \int_\Lambda f$. Then, the integral on the
right-hand side is 0, since $f$ is holomorphic in the simply connected open
set $\Omega$. We build $\Lambda$ as follows. Near the smooth parts of $\Gamma$, the curve $\Lambda$
is essentially a curve like $\Gamma_\epsilon$ in Lemma 2.4. Near points where smooth
parts of $\Gamma$ join, we shall use for $\Lambda$ an arc of a circle. This is illustrated
in Figure 5.


Figure 5.



To find the appropriate connecting arcs, we need the following
preliminary result.

Lemma 2.10 Let $\gamma : [0,1] \to \mathbb{C}$ be a simple smooth curve. Then, for all
sufficiently small $\delta > 0$ the circle $C_\delta$ centered at $\gamma(0)$ and of radius $\delta$
intersects $\gamma$ in precisely one point.

Proof. We may assume that $\gamma(0) = 0$. Since $\gamma(0) \neq \gamma(1)$ it is clear
that for each small $\delta > 0$, the circle $C_\delta$ intersects $\gamma$ in at least one point.
If the conclusion in the lemma is false, we can find a sequence of positive
$\delta_j$ going to 0, and so that the equation $|\gamma(t)| = \delta_j$ has at least two distinct
solutions. The mean value theorem applied to $h(t) = |\gamma(t)|^2$ provides a
sequence of positive numbers $t_j$ so that $t_j \to 0$ and $h'(t_j) = 0$. Thus

$$\gamma'(t_j) \cdot \gamma(t_j) = 0 \quad \text{for all } j.$$

However, the curve is smooth, so

$$\gamma(t) = \gamma(0) + \gamma'(0)t + t\varphi(t) \quad \text{and} \quad \gamma'(t) = \gamma'(0) + \psi(t),$$

where $|\varphi(t)| \to 0$ and $|\psi(t)| \to 0$ as $t$ goes to 0. Then recalling that $\gamma(0) = 0$,
we find $\gamma'(t) \cdot \gamma(t) = |\gamma'(0)|^2 t + o(|t|)$. The definition of a smooth curve
also requires that $\gamma'(0) \neq 0$, so the above gives

$$\gamma'(t) \cdot \gamma(t) \neq 0 \quad \text{for all small } t > 0.$$

This is the desired contradiction.

Returning to the proof of Cauchy’s theorem, choose $\epsilon$ so small that
the open set $\mathcal{U}$ of all points at a distance $< \epsilon$ of $\Gamma$ is contained in $\mathcal{O}$.

Next, if $P_1, \ldots, P_n$ denote the consecutive points where smooth parts
of $\Gamma$ join, we may pick $\delta < \epsilon/10$ so small that each circle $C_j$ centered at a
point $P_j$ and of radius $\delta$ intersects $\Gamma$ in precisely two distinct points (this
is possible by the previous lemma). These two points on $C_j$ determine
two arcs of circles, only one of which (denoted by $\mathcal{C}_j$) has an interior
entirely contained in $\Omega$. To see this, it suffices to recall that if $\gamma$ is a
parametrization of a smooth part of $\Gamma$ with end-point $P_j$, then for all
small $\epsilon'$ the curves parametrized by $\gamma_{\epsilon'}$ and $\gamma_{-\epsilon'}$ of Lemma 2.4 lie on
opposite sides of $\Gamma$ and must intersect the circle $C_j$. By construction the
disc $D_j^*$ centered at $P_j$ and of radius $2\delta$ is also contained in $\mathcal{U}$, hence
in $\mathcal{O}$.



We wish to construct $\Lambda$ so that we may argue as in the proof of
Theorem 5.1, Chapter 3 and establish $\int_\Gamma f = \int_\Lambda f$. To do so, we consider a
chain of discs $\mathcal{D} = \{D_0, \ldots, D_K\}$ contained in $\mathcal{U}$, and so that $\Gamma$ is
contained in their union, with $D_k \cap D_{k+1} \neq \emptyset$, $D_0 = D_K$, and with the discs
$D_j^*$ part of the chain $\mathcal{D}$. Suppose $\Gamma_j$ is the smooth part of $\Gamma$ that joins $P_j$
to $P_{j+1}$. By Lemma 2.4 it is possible to construct a continuous curve $\Lambda_j$
that is contained in $\Omega$ and in the union of the discs, and which connects
a point $B_j$ on $\mathcal{C}_j$ to a point $A_{j+1}$ on $\mathcal{C}_{j+1}$ (see Figure 6). Since
we only assumed that $\Gamma$ has one continuous derivative, $\Lambda_j$ need not be
smooth, but by approximating this continuous curve by polygonal lines
if necessary, we may actually assume that $\Lambda_j$ is also smooth. Then, $A_{j+1}$
is joined to $B_{j+1}$ by a piece of $\mathcal{C}_{j+1}$, and so on. This procedure provides
a piecewise-smooth curve $\Lambda$ that is closed and contained in $\Omega$.

Since $f$ has a primitive on each disc of the family $\mathcal{D}$, we may argue as
in the proof of Theorem 5.1, Chapter 3 to find that $\int_\Gamma f = \int_\Lambda f$. Since
$\Omega$ is simply connected, we have $\int_\Lambda f = 0$, and as a result

$$\int_\Gamma f(\zeta)\,d\zeta = 0.$$

Notes and References 


Useful references for many of the subjects treated here are Saks and Zygmund [34], 
Ahlfors [2], and Lang [23]. 

Introduction 

The citation is from Riemann’s dissertation [32]. 

Chapter 1 

The citation is a free translation of a passage in Borel’s book [6]. 

Chapter 2 

The citation is a translation of an excerpt from Cauchy’s memoir [7].

Results related to the natural boundaries of holomorphic functions in the unit 
disc can be found in Titchmarsh [36]. 

The construction of the universal functions in Problem 5 is due to G. D. Birkhoff
and G. R. MacLane.

Chapter 3 

The citation is a translation of a passage in Cauchy’s memoir [8].

Problem 1 and other results related to injective holomorphic mappings
(univalent functions) can be found in Duren [11].

Also, see Muskhelishvili [25] for more about the Cauchy integral introduced 
in Problem 5. 

Chapter 4 

The citation is from Wiener [40]. 

The argument in Exercise 1 was discovered by D. J. Newman; see [4]. 

The Paley-Wiener theorems appeared first in [28]; further generalizations can 
be found in Stein and Weiss [35]. 

Results related to the Borel transform (Problem 4) can be found in Boas [5]. 

Chapter 5 

The citation is a translation from the German of a passage in a letter from 
K. Weierstrass to S. Kowalewskaja; see [38]. 

A classical reference for Nevanlinna theory is the book by R. Nevanlinna 
himself [27]. 

Chapter 6 

A number of different proofs of the analytic continuation and functional equation
for the zeta function can be found in Chapter 2 of Titchmarsh [37].

Chapter 7 

The citation is from Hadamard [14]. Riemann’s statement concerning the zeroes 
of the zeta function in the critical strip is a passage taken from his paper [33]. 

Further material related to the proof of the prime number theorem presented
in the text is in Chapter 2 of Ingham [19], and Chapter 3 of Titchmarsh [37].

The “elementary” analysis of the distribution of primes (without using the
analytic properties of the zeta function) was initiated by Tchebychev, and
culminated in the Erdős-Selberg proof of the prime number theorem. See
Chapter XXII in Hardy and Wright [17].

The results in Problems 2 and 3 can be found in Chapter 4 of Ingham [19]. 
For Problem 4, consult Estermann [13]. 

Chapter 8 

The citation is from Christoffel [9]. 

A systematic treatment of conformal mappings is Nehari [26]. 

Some history related to the Riemann mapping theorem, as well as the details 
in Problem 7, can be found in Remmert [31]. 

Results related to the boundary behavior of holomorphic functions (Problem 6)
are in Chapter XIV of Zygmund [41].

An introduction to the interplay between the Poincaré metric and complex
analysis can be found in Ahlfors [1]. For further results on the Schwarz-Pick
lemma and hyperbolicity, see Kobayashi [21].

For more on Bieberbach’s conjecture, see Chapter 2 in Duren [11] and Chapter 8
in Hayman [18].

Chapter 9 

The citation is taken from Poincaré [30].

Problems 2, 3, and 4 are in Saks and Zygmund [34]. 

Chapter 10 

The citation is from Hardy, Chapter IX in [16]. 

A systematic account of the theory of theta functions and Jacobi’s theory of 
elliptic functions is in Whittaker and Watson [39], Chapters 21 and 22. 

Section 2. For more on the partition function, see Chapter XIX in Hardy and 
Wright [17]. 

Section 3. The more standard proofs of the theorems about the sum of two
and four squares are in Hardy and Wright [17], Chapter XX. The approach we
use was developed by Mordell and Hardy [15] to derive exact formulas for the
number of representations as the sum of $k$ squares, when $k \ge 5$. The special
case $k = 8$ is in Problem 6. For $k \le 4$ the method as given there breaks down
because of the non-absolute convergence of the associated “Eisenstein series.” In
our presentation we get around this difficulty by using the “forbidden” Eisenstein
series. When $k = 2$, an entirely different construction is needed, and the analysis
centering around $\mathcal{C}(\tau)$ is a further new aspect of this problem.

The theorem on the sum of three squares (Problem 1) is in Part I, Chapter 4 
of Landau [22]. 

Appendix A 

The citation is taken from the appendix in Airy’s article [3]. 

For systematic accounts of Laplace’s method, stationary phase, and the method
of steepest descent, see Erdélyi [12] and Copson [10].

The more refined asymptotics of the partition function can be found in Chapter 8
of Hardy [16].

Appendix B

The citation is taken from Picard’s address found in Jordan’s collected works [20]. 

The proof of the Jordan curve theorem for piecewise-smooth curves due to
Pederson [29] is an adaptation of the proof for polygonal curves which can be
found in Saks and Zygmund [34].

For a proof of the Jordan theorem for continuous curves using notions of
algebraic topology, see Munkres [24].


Symbol Glossary


$\mathrm{Re}(z)$, $\mathrm{Im}(z)$                 Real and imaginary parts                                  2
$\arg z$                                           Argument of $z$                                           4
$|z|$, $\overline{z}$                              Absolute value and complex conjugate                      3, 3
$D_r(z_0)$, $\overline{D}_r(z_0)$                  Open and closed discs centered at $z_0$ and with radius $r$   5, 6
$C_r(z_0)$                                         Circle centered at $z_0$ with radius $r$                  6
$D$, $C$                                           Generic disc and circle
$\mathbb{D}$                                       Unit disc                                                 6
$\Omega^c$, $\overline{\Omega}$, $\partial\Omega$  Complement, closure, and boundary of $\Omega$             6
$\mathrm{diam}(\Omega)$                            Diameter of $\Omega$                                      6
$\frac{\partial}{\partial z}$, $\frac{\partial}{\partial \overline{z}}$   Differential operators             12
$e^z$, $\cos z$, $\sin z$                          Complex exponential and trigonometric functions           14, 16
$\gamma^-$                                         Reverse parametrization                                   19
$O$, $o$, $\sim$                                   Bounds and asymptotic relations                           24
$\Delta$                                           Laplacian                                                 27
$F(\alpha, \beta, \gamma; z)$                      Hypergeometric series                                     28
$\mathrm{res}_z f$                                 Residue                                                   75
$P_r(\gamma)$, $\mathcal{P}_y(x)$                  Poisson kernels                                           67, 78
$\cosh z$, $\sinh z$                               Hyperbolic cosine and sine                                81, 83
$\mathbb{S}$                                       Riemann sphere                                            89
$\log$, $\log_\Omega$                              Logarithms                                                98, 99
$\hat{f}$                                          Fourier transform                                         111
$\mathfrak{F}_a$, $\mathfrak{F}$                   Class of functions with moderate decay in strips          113, 114
$S_a$, $S_{\delta,M}$                              Horizontal strips                                         113, 160
$\rho$, $\rho_f$                                   Order of growth                                           138
$E_k$                                              Canonical factors                                         145
$\psi_\alpha$                                      Blaschke factors                                          153


The page numbers on the right indicate the first time the symbol or
notation is defined or used. As usual, $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ denote the integers,
the rationals, the reals, and the complex numbers respectively.


$\Gamma(s)$                                        Gamma function                                            160
$\zeta(s)$                                         Riemann zeta function                                     168
$\vartheta$, $\Theta(z|\tau)$, $\theta(\tau)$      Theta function                                            169, 284, 284
$\xi(s)$                                           Xi function                                               169
$J_\nu$                                            Bessel functions                                          176
$B_m$                                              Bernoulli number                                          179
$\pi(x)$                                           Number of primes $\le x$                                  182
$f(x) \approx g(x)$                                Asymptotic relation                                       182
$\psi(x)$, $\Lambda(n)$, $\psi_1(x)$               Functions of Tchebychev                                   188, 189, 190
$d(n)$                                             Number of divisors of $n$                                 200
$\sigma_a(n)$                                      Sum of the $a$-th powers of divisors of $n$               200
$\mu(n)$                                           Möbius function                                           200
$\mathrm{Li}(x)$                                   Approximation to $\pi(x)$                                 202
$\mathbb{H}$                                       Upper half-plane                                          208
$\mathrm{Aut}(\Omega)$                             Automorphism group of $\Omega$                            219
$\mathrm{SL}_2(\mathbb{R})$                        Special linear group                                      222
$\mathrm{PSL}_2(\mathbb{R})$                       Projective special linear group                           223
$\mathrm{SU}(1,1)$                                 Group of fractional linear transformations                257
$\Lambda$, $\Lambda^*$                             Lattice and lattice minus the origin                      262, 267
$\wp$                                              Weierstrass elliptic function                             269
$E_k(\tau)$, $E^*(\tau)$                           Eisenstein series                                         273, 305
$F(\tau)$, $\tilde{F}(\tau)$                       Forbidden Eisenstein series and its reverse               278, 305
$\Pi(z|\tau)$                                      Triple product                                            286
$\eta(\tau)$                                       Dedekind eta function                                     292
$p(n)$                                             Partition function                                        293
$r_2(n)$                                           Number of ways $n$ is a sum of two squares                296
$r_4(n)$                                           Number of ways $n$ is a sum of four squares               297
$d_1(n)$, $d_3(n)$, $\sigma_1^*(n)$                Divisor functions                                         297, 304
$\mathrm{Ai}(s)$                                   Airy function                                             328
$W_\eta(z)$                                        Winding number                                            347


Index 


Relevant items that also arise in Book I are listed in this index,
preceded by the numeral I.


Abel’s theorem, 28
Airy function, 328
amplitude, 323; (I)3
analytic continuation, 53
analytic function, 9, 18
angle preserving, 255
argument principle, 90
arithmetic-geometric mean, 260
automorphisms, 219
    of the disc, 220
    of the upper half-plane, 222
axis
    imaginary, 2
    real, 2

Bernoulli
    numbers, 179, 180; (I)97, 167
    polynomials, 180; (I)98
Bessel function, 29, 176, 319; (I)197
Beta function, 175
Bieberbach conjecture, 259
Blaschke
    factors, 26, 153, 219
    products, 157
bump functions, (I)162

canonical factor, 145
    degree, 145
Casorati-Weierstrass theorem, 86
Cauchy inequalities, 48
Cauchy integral formulas, 48
Cauchy sequence, 5; (I)24
Cauchy theorem
    for a disc, 39
    for piecewise-smooth curves, 361
    for simply connected regions, 97
Cauchy-Riemann equations, 12
chain rule
    complex version, 27
    for holomorphic functions, 10
circle
    negative orientation, 20
    positive orientation, 20
closed disc, 6
complex differentiable, 9
complex number
    absolute value, 3
    argument, 4
    conjugate, 3
    imaginary part, 2
    polar form, 4
    purely imaginary, 2
    real part, 2
component, 26
conformal
    equivalence, 206
    map, 206
    mapping onto polygons, 231
connected
    closed set, 7
    component, 26
    open set, 7
    pathwise, 25
cotangent (partial fractions), 142
critical points, 326
critical strip, 184
curve, 20; (I)102
    closed, 20; (I)102
    end-points, 20
    length, 21; (I)102
    piecewise-smooth, 20
    simple, 20; (I)102
    smooth, 19


Foreword


Beginning in the spring of 2000, a series of four one-semester courses 
were taught at Princeton University whose purpose was to present, in 
an integrated manner, the core areas of analysis. The objective was to 
make plain the organic unity that exists between the various parts of the 
subject, and to illustrate the wide applicability of ideas of analysis to 
other fields of mathematics and science. The present series of books is 
an elaboration of the lectures that were given. 

While there are a number of excellent texts dealing with individual 
parts of what we cover, our exposition aims at a different goal: presenting
the various sub-areas of analysis not as separate disciplines, but
rather as highly interconnected. It is our view that seeing these relations 
and their resulting synergies will motivate the reader to attain a better 
understanding of the subject as a whole. With this outcome in mind, we 
have concentrated on the main ideas and theorems that have shaped the 
field (sometimes sacrificing a more systematic approach), and we have 
been sensitive to the historical order in which the logic of the subject 
developed. 

We have organized our exposition into four volumes, each reflecting 
the material covered in a semester. Their contents may be broadly
summarized as follows:

I. Fourier series and integrals. 

II. Complex analysis. 

III. Measure theory, Lebesgue integration, and Hilbert spaces. 

IV. A selection of further topics, including functional analysis,
distributions, and elements of probability theory.

However, this listing does not by itself give a complete picture of 
the many interconnections that are presented, nor of the applications 
to other branches that are highlighted. To give a few examples: the elements
of (finite) Fourier series studied in Book I, which lead to Dirichlet
characters, and from there to the infinitude of primes in an arithmetic
progression; the X-ray and Radon transforms, which arise in a number of
problems in Book I, and reappear in Book III to play an important role in
understanding Besicovitch-like sets in two and three dimensions; Fatou’s
theorem, which guarantees the existence of boundary values of bounded
holomorphic functions in the disc, and whose proof relies on ideas developed
in each of the first three books; and the theta function, which first
occurs in Book I in the solution of the heat equation, and is then used 
in Book II to find the number of ways an integer can be represented as 
the sum of two or four squares, and in the analytic continuation of the 
zeta function. 

A few further words about the books and the courses on which they 
were based. These courses were given at a rather intensive pace, with 48
lecture-hours a semester. The weekly problem sets played an indispensable
part, and as a result exercises and problems have a similarly important
role in our books. Each chapter has a series of “Exercises” that
are tied directly to the text, and while some are easy, others may require
more effort. However, the substantial number of hints that are given
should enable the reader to attack most exercises. There are also more
involved and challenging “Problems”; the ones that are most difficult, or
go beyond the scope of the text, are marked with an asterisk.

Despite the substantial connections that exist between the different 
volumes, enough overlapping material has been provided so that each of 
the first three books requires only minimal prerequisites: acquaintance 
with elementary topics in analysis such as limits, series, differentiable 
functions, and Riemann integration, together with some exposure to linear
algebra. This makes these books accessible to students interested
in such diverse disciplines as mathematics, physics, engineering, and 
finance, at both the undergraduate and graduate level. 

It is with great pleasure that we express our appreciation to all who 
have aided in this enterprise. We are particularly grateful to the students
who participated in the four courses. Their continuing interest,
enthusiasm, and dedication provided the encouragement that made this
project possible. We also wish to thank Adrian Banner and José Luis
Rodrigo for their special help in running the courses, and their efforts to
see that the students got the most from each class. In addition, Adrian
Banner also made valuable suggestions that are incorporated in the text.

We wish also to record a note of special thanks for the following
individuals: Charles Fefferman, who taught the first week (successfully
launching the whole project!); Paul Hagelstein, who in addition to reading
part of the manuscript taught several weeks of one of the courses, and
has since taken over the teaching of the second round of the series; and
Daniel Levine, who gave valuable help in proof-reading. Last but not
least, our thanks go to Gerree Pecht, for her consummate skill in typesetting
and for the time and energy she spent in the preparation of all
aspects of the lectures, such as transparencies, notes, and the manuscript.

We are also happy to acknowledge our indebtedness for the support 
we received from the 250th Anniversary Fund of Princeton University, 
and the National Science Foundation’s VIGRE program. 


Elias M. Stein 
Rami Shakarchi 

Princeton, New Jersey 
August 2002 


In this third volume we establish the basic facts concerning measure 
theory and integration. This allows us to reexamine and develop further 
several important topics that arose in the previous volumes, as well as to 
introduce a number of other subjects of substantial interest in analysis. 
To aid the interested reader, we have starred sections that contain more 
advanced material. These can be omitted on first reading. We also want 
to take this opportunity to thank Daniel Levine for his continuing help in 
proof-reading and the many suggestions he made that are incorporated 
in the text.


Introduction


I turn away in fright and horror from this lamentable 
plague of functions that do not have derivatives. 

C. Hermite, 1893 


Starting in about 1870 a revolutionary change in the conceptual framework
of analysis began to take shape, one that ultimately led to a vast
transformation and generalization of the understanding of such basic objects
as functions, and such notions as continuity, differentiability, and
integrability.

The earlier view that the relevant functions in analysis were given by 
formulas or other “analytic” expressions, that these functions were by 
their nature continuous (or nearly so), that by necessity such functions 
had derivatives for most points, and moreover these were integrable by 
the accepted methods of integration — all of these ideas began to give 
way under the weight of various examples and problems that arose in 
the subject, which could not be ignored and required new concepts to 
be understood. Parallel with these developments came new insights that 
were at once both more geometric and more abstract: a clearer understanding
of the nature of curves, their rectifiability and their extent; also
the beginnings of the theory of sets, starting with subsets of the line, the 
plane, etc., and the “measure” that could be assigned to each. 

That is not to say that there was not considerable resistance to the 
change of point-of-view that these advances required. Paradoxically, 
some of the leading mathematicians of the time, those who should have 
been best able to appreciate the new departures, were among the ones 
who were most skeptical. That the new ideas ultimately won out can 
be understood in terms of the many questions that could now be addressed.
We shall describe here, somewhat imprecisely, several of the
most significant such problems.


1 Fourier series: completion 

Whenever $f$ is a (Riemann) integrable function on $[-\pi, \pi]$ we defined in
Book I its Fourier series $f \sim \sum a_n e^{inx}$ by

(1) $a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx$,

and saw then that one had Parseval’s identity,

$\sum_{n=-\infty}^{\infty} |a_n|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx$.

However, the above relationship between functions and their Fourier
coefficients is not completely reciprocal when limited to Riemann integrable
functions. Thus if we consider the space $\mathcal{R}$ of such functions with
its square norm, and the space $\ell^2(\mathbb{Z})$ with its norm,¹ each element $f$ in
$\mathcal{R}$ assigns a corresponding element $\{a_n\}$ in $\ell^2(\mathbb{Z})$, and the two norms are
identical. However, it is easy to construct elements in $\ell^2(\mathbb{Z})$ that do not
correspond to functions in $\mathcal{R}$. Note also that the space $\ell^2(\mathbb{Z})$ is complete
in its norm, while $\mathcal{R}$ is not.² Thus we are led to two questions:

(i) What are the putative “functions” $f$ that arise when we complete
$\mathcal{R}$? In other words: given an arbitrary sequence $\{a_n\} \in \ell^2(\mathbb{Z})$ what
is the nature of the (presumed) function $f$ corresponding to these
coefficients?

(ii) How do we integrate such functions $f$ (and in particular verify (1))?
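
As a numerical illustration of the coefficient formula and of Parseval’s identity, the following Python sketch uses the sample function $f(x) = x$ on $[-\pi, \pi]$. This choice is an assumption made only for the example and is not taken from the text. A direct integration by parts gives $a_0 = 0$ and $a_n = i(-1)^n/n$ for $n \neq 0$, so both sides of the identity should equal $\pi^2/3$. The function names are hypothetical, and the integrals are approximated by a simple midpoint rule.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=20_000):
    """Midpoint-rule approximation of a_n = (1/2pi) * integral of f(x) e^{-inx} over [-pi, pi]."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        x = -math.pi + (k + 0.5) * h
        total += f(x) * cmath.exp(-1j * n * x)
    return total * h / (2 * math.pi)

def mean_square(f, samples=20_000):
    """Midpoint-rule approximation of (1/2pi) * integral of |f(x)|^2 over [-pi, pi]."""
    h = 2 * math.pi / samples
    total = sum(abs(f(-math.pi + (k + 0.5) * h)) ** 2 for k in range(samples))
    return total * h / (2 * math.pi)

def coefficient_energy(N):
    """Sum of |a_n|^2 over 0 < |n| <= N for f(x) = x, using |a_n|^2 = 1/n^2."""
    return sum(2.0 / (n * n) for n in range(1, N + 1))

def identity(x):
    return x

if __name__ == "__main__":
    print(fourier_coefficient(identity, 1))                  # close to -1j
    print(coefficient_energy(1000), mean_square(identity))   # both near pi^2 / 3
```

The partial sums of $\sum |a_n|^2$ increase toward the mean square of $f$ from below, which is what Parseval’s identity predicts for this example.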

2 Limits of continuous functions 

Suppose $\{f_n\}$ is a sequence of continuous functions on $[0, 1]$. We assume
that $\lim_{n \to \infty} f_n(x) = f(x)$ exists for every $x$, and inquire as to the nature
of the limiting function $f$.

If we suppose that the convergence is uniform, matters are straightforward
and $f$ is then everywhere continuous. However, once we drop
the assumption of uniform convergence, things may change radically and
the issues that arise can be quite subtle. An example of this is given by
the fact that one can construct a sequence of continuous functions $\{f_n\}$
converging everywhere to $f$ so that


(a) $0 \le f_n(x) \le 1$ for all $x$.

(b) The sequence $f_n(x)$ is monotonically decreasing as $n \to \infty$.

(c) The limiting function $f$ is not Riemann integrable.³

However, in view of (a) and (b), the sequence $\int_0^1 f_n(x)\,dx$ converges to
a limit. So it is natural to ask: what method of integration can be used
to integrate $f$ and obtain that for it

$\int_0^1 f(x)\,dx = \lim_{n \to \infty} \int_0^1 f_n(x)\,dx$?

It is with Lebesgue integration that we can solve both this problem
and the previous one.

¹ We use the notation of Chapter 3 in Book I.

² See the discussion surrounding Theorem 1.1 in Section 1, Chapter 3 of Book I.

3 Length of curves 

The study of curves in the plane and the calculation of their lengths
are among the first issues dealt with when one learns calculus. Suppose
we consider a continuous curve $\Gamma$ in the plane, given parametrically by
$\Gamma = \{(x(t), y(t))\}$, $a \le t \le b$, with $x$ and $y$ continuous functions of $t$. We
define the length of $\Gamma$ in the usual way: as the supremum of the lengths
of all polygonal lines joining successively finitely many points of $\Gamma$, taken
in order of increasing $t$. We say that $\Gamma$ is rectifiable if its length $L$ is
finite. When $x(t)$ and $y(t)$ are continuously differentiable we have the
well-known formula,

(2) $L = \int_a^b \left( (x'(t))^2 + (y'(t))^2 \right)^{1/2} dt$.

The problems we are led to arise when we consider general curves.
More specifically, we can ask:

(i) What are the conditions on the functions $x(t)$ and $y(t)$ that guarantee
the rectifiability of $\Gamma$?

(ii) When these are satisfied, does the formula (2) hold?

The first question has a complete answer in terms of the notion of functions
of “bounded variation.” As to the second, it turns out that if $x$ and
$y$ are of bounded variation, the integral (2) is always meaningful; however,
the equality fails in general, but can be restored under appropriate
reparametrization of the curve $\Gamma$.
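
The definition of length by polygonal lines and the formula (2) can be compared numerically in the smooth case. The Python sketch below does this for the unit circle $x(t) = \cos t$, $y(t) = \sin t$, $0 \le t \le 2\pi$, which is an example chosen here and not in the text; the function names are hypothetical. Uniform partitions give only lower bounds for the supremum, and these are seen to increase toward $2\pi$.

```python
import math

def polygonal_length(x, y, a, b, n):
    """Length of the polygon joining (x(t_k), y(t_k)) for t_k = a + k(b - a)/n, k = 0..n."""
    pts = [(x(a + k * (b - a) / n), y(a + k * (b - a) / n)) for k in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def integral_length(dx, dy, a, b, n):
    """Midpoint-rule approximation of the integral of (x'(t)^2 + y'(t)^2)^(1/2) over [a, b]."""
    h = (b - a) / n
    return sum(math.hypot(dx(a + (k + 0.5) * h), dy(a + (k + 0.5) * h)) for k in range(n)) * h

def neg_sin(t):
    return -math.sin(t)

if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        # Inscribed polygons: each value is a lower bound for the length of the circle.
        print(n, polygonal_length(math.cos, math.sin, 0.0, 2 * math.pi, n))
    print(integral_length(neg_sin, math.cos, 0.0, 2 * math.pi, 1000))
```

For curves that are merely continuous the derivatives passed to `integral_length` need not exist, which is exactly where questions (i) and (ii) begin.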


³ The limit $f$ can be highly discontinuous. See, for instance, Exercise 10 in Chapter 1.

There are further issues that arise. Rectifiable curves, because they
are endowed with length, are genuinely one-dimensional in nature. Are 
there (non-rectifiable) curves that are two-dimensional? We shall see 
that, indeed, there are continuous curves in the plane that fill a square, 
or more generally have any dimension between 1 and 2, if the notion of 
fractional dimension is appropriately defined. 

4 Differentiation and integration 

The so-called “fundamental theorem of the calculus” expresses the fact 
that differentiation and integration are inverse operations, and this can 
be stated in two different ways, which we abbreviate as follows: 

(3) $F(b) - F(a) = \int_a^b F'(x)\,dx$,

(4) $\frac{d}{dx} \int_0^x f(y)\,dy = f(x)$.

For the first assertion, the existence of continuous functions $F$ that are
nowhere differentiable, or for which $F'(x)$ exists for every $x$, but $F'$ is
not integrable, leads to the problem of finding a general class of the F for 
which (3) is valid. As for (4), the question is to formulate properly and 
establish this assertion for the general class of integrable functions / that 
arise in the solution of the first two problems considered above. These 
questions can be answered with the help of certain “covering” arguments, 
and the notion of absolute continuity. 

5 The problem of measure 

To put matters clearly, the fundamental issue that must be understood 
in order to try to answer all the questions raised above is the problem 
of measure. Stated (imprecisely) in its version in two dimensions, it
is the problem of assigning to each subset $E$ of $\mathbb{R}^2$ its two-dimensional
measure $m_2(E)$, that is, its “area,” extending the standard notion defined
for elementary sets. Let us instead state more precisely the analogous
problem in one dimension, that of constructing one-dimensional measure
$m_1 = m$, which generalizes the notion of length in $\mathbb{R}$.

We are looking for a non-negative function $m$ defined on the family of
subsets $E$ of $\mathbb{R}$ that we allow to be extended-valued, that is, to take on
the value $+\infty$. We require:


(a) $m(E) = b - a$ if $E$ is the interval $[a, b]$, $a \le b$, of length $b - a$.

(b) $m(E) = \sum_{n=1}^{\infty} m(E_n)$ whenever $E = \bigcup_{n=1}^{\infty} E_n$ and the sets $E_n$ are
disjoint.

Condition (b) is the “countable additivity” of the measure m. It implies 
the special case: 

(b') $m(E_1 \cup E_2) = m(E_1) + m(E_2)$ if $E_1$ and $E_2$ are disjoint.

However, to apply the many limiting arguments that arise in the theory
the general case (b) is indispensable, and (b') by itself would definitely
be inadequate.
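
A small computation shows what countable additivity asks for in a concrete case. The example is ours and not from the text: the disjoint intervals $E_n = (1/(n+1), 1/n]$, $n \ge 1$, have union $(0, 1]$. We take for granted that a half-open interval $(a, b]$ has measure $b - a$, which axiom (a) states only for closed intervals. The Python sketch below (with hypothetical function names) computes the partial sums exactly with rational arithmetic.

```python
from fractions import Fraction

def interval_length(n):
    """Length of E_n = (1/(n+1), 1/n], namely 1/n - 1/(n+1)."""
    return Fraction(1, n) - Fraction(1, n + 1)

def partial_measure(N):
    """Sum of the lengths of E_1, ..., E_N, whose union is (1/(N+1), 1]."""
    return sum(interval_length(n) for n in range(1, N + 1))

if __name__ == "__main__":
    for N in (1, 10, 100, 1000):
        # Each partial sum equals N/(N+1), so it falls short of 1 by exactly 1/(N+1).
        print(N, partial_measure(N), 1 - partial_measure(N))
```

Finite additivity (b') only accounts for the sets $(1/(N+1), 1]$; passing to the limit to obtain $m((0, 1]) = 1$ from the pieces $E_n$ is the step that requires (b).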

To the axioms (a) and (b) one adds the translation-invariance of m, 
namely 

(c) $m(E + h) = m(E)$, for every $h \in \mathbb{R}$.

A basic result of the theory is the existence (and uniqueness) of such
a measure, Lebesgue measure, when one limits oneself to a class of reasonable
sets, those which are “measurable.” This class of sets is closed
under countable unions, intersections, and complements, and contains
the open sets, the closed sets, and so forth.⁴

It is with the construction of this measure that we begin our study. 
From it will flow the general theory of integration, and in particular the 
solutions of the problems discussed above. 

A chronology 

We conclude this introduction by listing some of the signal events that 
marked the early development of the subject. 

1872 — Weierstrass’s construction of a nowhere differentiable function. 

1881 — Introduction of functions of bounded variation by Jordan and 
later (1887) connection with rectifiability. 

1883 — Cantor’s ternary set. 

1890 — Construction of a space-filling curve by Peano. 

1898 — Borel’s measurable sets. 

1902 — Lebesgue’s theory of measure and integration. 

1905 — Construction of non-measurable sets by Vitali. 

1906 — Fatou’s application of Lebesgue theory to complex analysis. 


⁴ There is no such measure on the class of all subsets, since there exist non-measurable
sets. See the construction of such a set at the end of Section 3, Chapter 1.




Measure Theory 


The sets whose measure we can define by virtue of the 
preceding ideas we will call measurable sets; we do 
this without intending to imply that it is not possible 
to assign a measure to other sets. 

E. Borel, 1898 


This chapter is devoted to the construction of Lebesgue measure in $\mathbb{R}^d$
and the study of the resulting class of measurable functions. After some
preliminaries we pass to the first important definition, that of exterior
measure for any subset $E$ of $\mathbb{R}^d$. This is given in terms of approximations
by unions of cubes that cover $E$. With this notion in hand we can
define measurability and thus restrict consideration to those sets that 
are measurable. We then turn to the fundamental result: the collection 
of measurable sets is closed under complements and countable unions, 
and the measure is additive if the subsets in the union are disjoint. 

The concept of measurable functions is a natural outgrowth of the 
idea of measurable sets. It stands in the same relation as the concept 
of continuous functions does to open (or closed) sets. But it has the 
important advantage that the class of measurable functions is closed 
under pointwise limits. 

1 Preliminaries 

We begin by discussing some elementary concepts which are basic to the 
theory developed below. 

The main idea in calculating the “volume” or “measure” of a subset
of $\mathbb{R}^d$ consists of approximating this set by unions of other sets whose
geometry is simple and whose volumes are known. It is convenient to
speak of “volume” when referring to sets in $\mathbb{R}^d$; but in reality it means
“area” in the case $d = 2$ and “length” in the case $d = 1$. In the approach
given here we shall use rectangles and cubes as the main building blocks
of the theory: in $\mathbb{R}$ we use intervals, while in $\mathbb{R}^d$ we take products of
intervals. In all dimensions rectangles are easy to manipulate and have
a standard notion of volume that is given by taking the product of the
length of all sides.

Next, we prove two simple theorems that highlight the importance of
these rectangles in the geometry of open sets: in $\mathbb{R}$ every open set is a
countable union of disjoint open intervals, while in $\mathbb{R}^d$, $d \ge 2$, every open
set is “almost” the disjoint union of closed cubes, in the sense that only
the boundaries of the cubes can overlap. These two theorems motivate
the definition of exterior measure given later.

We shall use the following standard notation. A point $x \in \mathbb{R}^d$ consists
of a $d$-tuple of real numbers

$x = (x_1, x_2, \ldots, x_d)$, $x_i \in \mathbb{R}$, for $i = 1, \ldots, d$.

Addition of points is componentwise, and so is multiplication by a real
scalar. The norm of $x$ is denoted by $|x|$ and is defined to be the standard
Euclidean norm given by

$|x| = \left( x_1^2 + \cdots + x_d^2 \right)^{1/2}$.

The distance between two points $x$ and $y$ is then simply $|x - y|$.

The complement of a set $E$ in $\mathbb{R}^d$ is denoted by $E^c$ and defined by

$E^c = \{x \in \mathbb{R}^d : x \notin E\}$.

If $E$ and $F$ are two subsets of $\mathbb{R}^d$, we denote the complement of $F$ in $E$
by

$E - F = \{x \in \mathbb{R}^d : x \in E \text{ and } x \notin F\}$.

The distance between two sets $E$ and $F$ is defined by

$d(E, F) = \inf |x - y|$,

where the infimum is taken over all $x \in E$ and $y \in F$.
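
For finite sets the infimum in $d(E, F)$ is a minimum and can be computed directly. The Python sketch below is a minimal illustration under that assumption; the function name `set_distance` and the sample sets are hypothetical. The second example hints at why an infimum is needed in general: for $E = \{0\}$ and $F = \{1/n : n \ge 1\}$ in $\mathbb{R}$ (written here as points of $\mathbb{R}^2$), the distances computed from the first $N$ points of $F$ decrease to $0$, although the two sets are disjoint.

```python
import math

def set_distance(E, F):
    """d(E, F) = min |x - y| over x in E and y in F, for finite non-empty sets of points in R^d."""
    return min(math.dist(x, y) for x in E for y in F)

def truncated_F(N):
    """The first N points (1/n, 0), n = 1..N, of the set F = {(1/n, 0) : n >= 1}."""
    return [(1.0 / n, 0.0) for n in range(1, N + 1)]

if __name__ == "__main__":
    print(set_distance([(0.0, 0.0)], [(3.0, 4.0)]))  # Euclidean distance between two points
    for N in (1, 10, 100):
        print(N, set_distance([(0.0, 0.0)], truncated_F(N)))
```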

Open, closed, and compact sets 

The open ball in $\mathbb{R}^d$ centered at $x$ and of radius $r$ is defined by

$B_r(x) = \{y \in \mathbb{R}^d : |y - x| < r\}$.

A subset $E \subset \mathbb{R}^d$ is open if for every $x \in E$ there exists $r > 0$ with
$B_r(x) \subset E$. By definition, a set is closed if its complement is open.

We note that any (not necessarily countable) union of open sets is
open, while in general the intersection of only finitely many open sets
is open. A similar statement holds for the class of closed sets, if one
interchanges the roles of unions and intersections.

A set E is bounded if it is contained in some ball of finite radius. 
A bounded set is compact if it is also closed. Compact sets enjoy the 
Heine-Borel covering property: 

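• Assume $E$ is compact, $E \subset \bigcup_\alpha \mathcal{O}_\alpha$, and each $\mathcal{O}_\alpha$ is open. Then
there are finitely many of the open sets, $\mathcal{O}_{\alpha_1}, \mathcal{O}_{\alpha_2}, \ldots, \mathcal{O}_{\alpha_N}$, such
that $E \subset \bigcup_{j=1}^{N} \mathcal{O}_{\alpha_j}$.

In words, any covering of a compact set by a collection of open sets
contains a finite subcovering.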

A point $x \in \mathbb{R}^d$ is a limit point of the set $E$ if for every $r > 0$, the ball
$B_r(x)$ contains points of $E$. This means that there are points in $E$ which
are arbitrarily close to $x$. An isolated point of $E$ is a point $x \in E$ such
that there exists an $r > 0$ where $B_r(x) \cap E$ is equal to $\{x\}$.

A point $x \in E$ is an interior point of $E$ if there exists $r > 0$ such
that $B_r(x) \subset E$. The set of all interior points of $E$ is called the interior
of $E$. Also, the closure $\overline{E}$ of $E$ consists of the union of $E$ and all
its limit points. The boundary of a set $E$, denoted by $\partial E$, is the set of
points which are in the closure of $E$ but not in the interior of $E$.

Note that the closure of a set is a closed set; every point in $E$ is a
limit point of $E$; and a set is closed if and only if it contains all its limit
points. Finally, a closed set $E$ is perfect if $E$ does not have any isolated
points.

Rectangles and cubes 

A (closed) rectangle $R$ in $\mathbb{R}^d$ is given by the product of $d$ one-dimensional
closed and bounded intervals

$R = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_d, b_d]$,

where $a_j \le b_j$ are real numbers, $j = 1, 2, \ldots, d$. In other words, we have

$R = \{(x_1, \ldots, x_d) \in \mathbb{R}^d : a_j \le x_j \le b_j \text{ for all } j = 1, 2, \ldots, d\}$.

We remark that in our definition, a rectangle is closed and has sides
parallel to the coordinate axes. In $\mathbb{R}$, the rectangles are precisely the
closed and bounded intervals, while in $\mathbb{R}^2$ they are the usual four-sided
rectangles. In $\mathbb{R}^3$ they are the closed parallelepipeds.

We say that the lengths of the sides of the rectangle $R$ are $b_1 - a_1, \ldots, b_d - a_d$.
The volume of the rectangle $R$ is denoted by $|R|$, and
is defined to be

$|R| = (b_1 - a_1) \cdots (b_d - a_d)$.

Figure 1. Rectangles in $\mathbb{R}^d$, $d = 1, 2, 3$

Of course, when $d = 1$ the “volume” equals length, and when $d = 2$ it
equals area.

An open rectangle is the product of open intervals, and the interior of
the rectangle $R$ is then

$(a_1, b_1) \times (a_2, b_2) \times \cdots \times (a_d, b_d)$.

Also, a cube is a rectangle for which $b_1 - a_1 = b_2 - a_2 = \cdots = b_d - a_d$.
So if $Q \subset \mathbb{R}^d$ is a cube of common side length $\ell$, then $|Q| = \ell^d$.

A union of rectangles is said to be almost disjoint if the interiors of 
the rectangles are disjoint. 

In this chapter, coverings by rectangles and cubes play a major role, 
so we isolate here two important lemmas. 

Lemma 1.1 If a rectangle is the almost disjoint union of finitely many
other rectangles, say $R = \bigcup_{k=1}^N R_k$, then

$$|R| = \sum_{k=1}^N |R_k|.$$
Proof. We consider the grid formed by extending indefinitely the
sides of all rectangles $R_1, \ldots, R_N$. This construction yields finitely many
rectangles $\tilde{R}_1, \ldots, \tilde{R}_M$, and a partition $J_1, \ldots, J_N$ of the integers between
1 and $M$, such that the unions

$$R = \bigcup_{j=1}^M \tilde{R}_j \quad\text{and}\quad R_k = \bigcup_{j \in J_k} \tilde{R}_j, \quad \text{for } k = 1, \ldots, N$$

are almost disjoint (see the illustration in Figure 2).

Figure 2. The grid formed by the rectangles $R_k$


For the rectangle $R$, for example, we see that $|R| = \sum_{j=1}^M |\tilde{R}_j|$, since
the grid actually partitions the sides of $R$ and each $\tilde{R}_j$ consists of taking
products of the intervals in these partitions. Thus when adding the
volumes of the $\tilde{R}_j$ we are summing the corresponding products of lengths
of the intervals that arise. Since this also holds for the other rectangles
$R_1, \ldots, R_N$, we conclude that

$$|R| = \sum_{j=1}^M |\tilde{R}_j| = \sum_{k=1}^N \sum_{j \in J_k} |\tilde{R}_j| = \sum_{k=1}^N |R_k|.$$
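The grid argument in this proof can be checked on a small example. The following is only a sketch under stated assumptions: rectangles are represented as lists of `(a_j, b_j)` pairs, and the helper names `volume`, `grid_cells`, and `contains` are hypothetical names introduced here for illustration, not notation from the text.

```python
from fractions import Fraction
from itertools import product


def volume(rect):
    """Volume of a rectangle given as a list of (a_j, b_j) pairs."""
    v = Fraction(1)
    for a, b in rect:
        v *= Fraction(b) - Fraction(a)
    return v


def grid_cells(rects):
    """Cells of the grid obtained by extending the sides of all given rectangles."""
    d = len(rects[0])
    # In each coordinate, collect every endpoint that occurs and sort them.
    cuts = [sorted({Fraction(e) for r in rects for e in r[j]}) for j in range(d)]
    # Consecutive endpoints give the one-dimensional pieces of the partition.
    axes = [list(zip(c[:-1], c[1:])) for c in cuts]
    return [list(cell) for cell in product(*axes)]


def contains(rect, cell):
    """True if the cell is contained in the rectangle."""
    return all(a <= c and e <= b for (a, b), (c, e) in zip(rect, cell))


# A rectangle R written as an almost disjoint union of three rectangles.
R = [(0, 3), (0, 2)]
parts = [[(0, 1), (0, 2)], [(1, 3), (0, 1)], [(1, 3), (1, 2)]]
cells = grid_cells([R] + parts)
print(sum(volume(c) for c in cells), sum(volume(p) for p in parts))
```

Each $R_k$ is recovered as the union of the grid cells it contains, so summing cell volumes in two different groupings gives the identity of the lemma for this example.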


A slight modification of this argument then yields the following: 


























Lemma 1.2 If $R, R_1, \ldots, R_N$ are rectangles, and $R \subset \bigcup_{k=1}^N R_k$, then

$$|R| \le \sum_{k=1}^N |R_k|.$$

The main idea consists of taking the grid formed by extending all sides
of the rectangles $R, R_1, \ldots, R_N$, and noting that the sets corresponding
to the $J_k$ (in the above proof) need not be disjoint any more.

We now proceed to give a description of the structure of open sets in
terms of cubes. We begin with the case of $\mathbb{R}$.

Theorem 1.3 Every open subset $\mathcal{O}$ of $\mathbb{R}$ can be written uniquely as a
countable union of disjoint open intervals.

Proof. For each $x \in \mathcal{O}$, let $I_x$ denote the largest open interval containing
$x$ and contained in $\mathcal{O}$. More precisely, since $\mathcal{O}$ is open, $x$ is contained
in some small (non-trivial) interval, and therefore if

$$a_x = \inf\{a < x : (a, x) \subset \mathcal{O}\} \quad\text{and}\quad b_x = \sup\{b > x : (x, b) \subset \mathcal{O}\}$$

we must have $a_x < x < b_x$ (with possibly infinite values for $a_x$ and $b_x$).
If we now let $I_x = (a_x, b_x)$, then by construction we have $x \in I_x$ as well
as $I_x \subset \mathcal{O}$. Hence

$$\mathcal{O} = \bigcup_{x \in \mathcal{O}} I_x.$$

Now suppose that two intervals $I_x$ and $I_y$ intersect. Then their union
(which is also an open interval) is contained in $\mathcal{O}$ and contains $x$. Since
$I_x$ is maximal, we must have $(I_x \cup I_y) \subset I_x$, and similarly $(I_x \cup I_y) \subset I_y$.
This can happen only if $I_x = I_y$; therefore, any two distinct intervals in
the collection $\mathcal{I} = \{I_x\}_{x \in \mathcal{O}}$ must be disjoint. The proof will be complete
once we have shown that there are only countably many distinct intervals
in the collection $\mathcal{I}$. This, however, is easy to see, since every open interval
$I_x$ contains a rational number. Since different intervals are disjoint, they
must contain distinct rationals, and therefore $\mathcal{I}$ is countable, as desired.
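For an open set given as a finite union of open intervals, the maximal intervals $I_x$ of the proof can be computed by merging overlapping pieces. This is a minimal sketch for that finite case only; the function name `components` and the representation of an interval as a pair `(a, b)` are assumptions made for illustration.

```python
def components(intervals):
    """Disjoint open intervals whose union equals the union of the given open intervals.

    Each interval is a pair (a, b) with a < b, standing for the open interval (a, b).
    """
    merged = []
    for a, b in sorted(i for i in intervals if i[0] < i[1]):
        # Two open intervals belong to the same component only if they overlap:
        # (0, 1) and (1, 2) stay separate because the point 1 is missing.
        if merged and a < merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], b)
        else:
            merged.append([a, b])
    return [tuple(m) for m in merged]


print(components([(1, 3), (0, 2), (5, 6)]))
```

The strict inequality in the overlap test reflects the fact that the intervals are open; with closed intervals, touching end-points would be merged.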

Naturally, if $\mathcal{O}$ is open and $\mathcal{O} = \bigcup_{j=1}^\infty I_j$, where the $I_j$'s are disjoint
open intervals, the measure of $\mathcal{O}$ ought to be $\sum_{j=1}^\infty |I_j|$. Since this
representation is unique, we could take this as a definition of measure; we
would then note that whenever $\mathcal{O}_1$ and $\mathcal{O}_2$ are open and disjoint, the
measure of their union is the sum of their measures. Although this provides
a natural notion of measure for an open set, it is not immediately clear
how to generalize it to other sets in $\mathbb{R}$. Moreover, a similar approach in
higher dimensions already encounters complications even when defining
measures of open sets, since in this context the direct analogue of
Theorem 1.3 is not valid (see Exercise 12). There is, however, a substitute
result.

Theorem 1.4 Every open subset $\mathcal{O}$ of $\mathbb{R}^d$, $d \ge 1$, can be written as a
countable union of almost disjoint closed cubes.

Proof. We must construct a countable collection $\mathcal{Q}$ of closed cubes
whose interiors are disjoint, and so that $\mathcal{O} = \bigcup_{Q \in \mathcal{Q}} Q$.

As a first step, consider the grid in $\mathbb{R}^d$ formed by taking all closed cubes
of side length 1 whose vertices have integer coordinates. In other words,
we consider the natural grid of lines parallel to the axes, that is, the grid
generated by the lattice $\mathbb{Z}^d$. We shall also use the grids formed by cubes
of side length $2^{-N}$ obtained by successively bisecting the original grid.

We either accept or reject cubes in the initial grid as part of $\mathcal{Q}$ according
to the following rule: if $Q$ is entirely contained in $\mathcal{O}$ then we accept
$Q$; if $Q$ intersects both $\mathcal{O}$ and $\mathcal{O}^c$ then we tentatively accept it; and if $Q$
is entirely contained in $\mathcal{O}^c$ then we reject it.

As a second step, we bisect the tentatively accepted cubes into $2^d$ cubes
with side length 1/2. We then repeat our procedure, by accepting the
smaller cubes if they are completely contained in $\mathcal{O}$, tentatively accepting
them if they intersect both $\mathcal{O}$ and $\mathcal{O}^c$, and rejecting them if they are
contained in $\mathcal{O}^c$. Figure 3 illustrates these steps for an open set in $\mathbb{R}^2$.



Figure 3. Decomposition of $\mathcal{O}$ into almost disjoint cubes (panels: Step 1, Step 2)
This procedure is then repeated indefinitely, and (by construction)
the resulting collection $\mathcal{Q}$ of all accepted cubes is countable and consists
of almost disjoint cubes. To see why their union is all of $\mathcal{O}$, we note
that given $x \in \mathcal{O}$ there exists a cube of side length $2^{-N}$ (obtained from
successive bisections of the original grid) that contains $x$ and that is
entirely contained in $\mathcal{O}$. Either this cube has been accepted, or it is
contained in a cube that has been previously accepted. This shows that
the union of all cubes in $\mathcal{Q}$ covers $\mathcal{O}$.
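The accept, tentatively accept, reject rule can be tried out on a concrete open set. The sketch below assumes the open set is the open disc of a given radius centered at the origin in the plane, and it stops after finitely many bisections, so it only produces a finite part of the collection. The function name `dyadic_cubes` and its parameters are hypothetical choices made for this illustration.

```python
import math


def dyadic_cubes(radius=1.0, depth=6):
    """Accepted dyadic squares for the open disc of the given radius about the origin.

    Returns (x, y, side) triples: the lower-left corner and side length of each square.
    Only the first depth + 1 generations of the procedure are carried out.
    """
    accepted = []
    n = math.ceil(radius)
    # Unit squares of the integer grid that can meet the disc.
    pending = [(float(i), float(j), 1.0) for i in range(-n, n) for j in range(-n, n)]
    for _ in range(depth + 1):
        next_pending = []
        for x, y, s in pending:
            # Distances from the origin to the farthest and nearest points of the square.
            far = math.hypot(max(abs(x), abs(x + s)), max(abs(y), abs(y + s)))
            nx = 0.0 if x <= 0.0 <= x + s else min(abs(x), abs(x + s))
            ny = 0.0 if y <= 0.0 <= y + s else min(abs(y), abs(y + s))
            near = math.hypot(nx, ny)
            if far < radius:
                # The closed square lies inside the open disc: accept it.
                accepted.append((x, y, s))
            elif near < radius:
                # The square meets both the disc and its complement: bisect it.
                h = s / 2
                next_pending += [(x + dx, y + dy, h) for dx in (0, h) for dy in (0, h)]
            # Otherwise the square lies in the complement and is rejected.
        pending = next_pending
    return accepted


area = sum(s * s for _, _, s in dyadic_cubes(1.0, 6))
print(area, math.pi)
```

Since the accepted squares are almost disjoint and contained in the disc, the printed total area stays below $\pi$ and grows as `depth` increases.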

Once again, if $\mathcal{O} = \bigcup_{j=1}^\infty R_j$ where the rectangles $R_j$ are almost disjoint,
it is reasonable to assign to $\mathcal{O}$ the measure $\sum_{j=1}^\infty |R_j|$. This is
natural since the volume of the boundary of each rectangle should be 0,
and the overlap of the rectangles should not contribute to the volume
of $\mathcal{O}$. We note, however, that the above decomposition into cubes is
not unique, and it is not immediate that the sum is independent of this
decomposition. So in $\mathbb{R}^d$ with $d \ge 2$, the notion of volume or area, even
for open sets, is more subtle.

The general theory developed in the next section actually yields a 
notion of volume that is consistent with the decompositions of open sets 
of the previous two theorems, and applies to all dimensions. Before we 
come to that, we discuss an important example in $\mathbb{R}$.


The Cantor set 

The Cantor set plays a prominent role in set theory and in analysis in 
general. It and its variants provide a rich source of enlightening examples. 

We begin with the closed unit interval $C_0 = [0,1]$ and let $C_1$ denote
the set obtained from deleting the middle third open interval from $[0,1]$,
that is,

$$C_1 = [0, 1/3] \cup [2/3, 1].$$

Next, we repeat this procedure for each sub-interval of $C_1$; that is, we
delete the middle third open interval. At the second stage we get

$$C_2 = [0, 1/9] \cup [2/9, 1/3] \cup [2/3, 7/9] \cup [8/9, 1].$$

We repeat this process for each sub-interval of $C_2$, and so on (Figure 4).

This procedure yields a sequence $C_k$, $k = 0, 1, 2, \ldots$ of compact sets
with

$$C_0 \supset C_1 \supset C_2 \supset \cdots \supset C_k \supset C_{k+1} \supset \cdots.$$
Figure 4. Construction of the Cantor set


The Cantor set $\mathcal{C}$ is by definition the intersection of all $C_k$'s:

$$\mathcal{C} = \bigcap_{k=0}^\infty C_k.$$

The set $\mathcal{C}$ is not empty, since all end-points of the intervals in $C_k$ (all $k$)
belong to $\mathcal{C}$.

Despite its simple construction, the Cantor set enjoys many interesting
topological and analytical properties. For instance, $\mathcal{C}$ is closed and
bounded, hence compact. Also, $\mathcal{C}$ is totally disconnected: given any
$x, y \in \mathcal{C}$ there exists $z \notin \mathcal{C}$ that lies between $x$ and $y$. Finally, $\mathcal{C}$ is
perfect: it has no isolated points (Exercise 1).

Next, we turn our attention to the question of determining the “size” 
of C. This is a delicate problem, one that may be approached from 
different angles depending on the notion of size we adopt. For instance, 
in terms of cardinality the Cantor set is rather large: it is not countable. 
Since it can be mapped onto the interval [0,1], the Cantor set has the
cardinality of the continuum (Exercise 2). 

However, from the point of view of “length” the size of $\mathcal{C}$ is small.
Roughly speaking, the Cantor set has length zero, and this follows from
the following intuitive argument: the set $\mathcal{C}$ is covered by sets $C_k$ whose
lengths go to zero. Indeed, $C_k$ is a disjoint union of $2^k$ intervals of length
$3^{-k}$, making the total length of $C_k$ equal to $(2/3)^k$. But $\mathcal{C} \subset C_k$ for all
$k$, and $(2/3)^k \to 0$ as $k$ tends to infinity. We shall define a notion of
measure and make this argument precise in the next section.
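The stages $C_k$ and their total lengths can be generated exactly with rational arithmetic. This is a small sketch; the function names `cantor_stage` and `total_length` are hypothetical names introduced here.

```python
from fractions import Fraction


def cantor_stage(k):
    """Closed intervals making up C_k, as (left, right) pairs of exact fractions."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        nxt = []
        for a, b in intervals:
            third = (b - a) / 3
            # Delete the open middle third, keeping the two closed outer thirds.
            nxt.append((a, a + third))
            nxt.append((b - third, b))
        intervals = nxt
    return intervals


def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum((b - a for a, b in intervals), Fraction(0))


for k in range(5):
    print(k, len(cantor_stage(k)), total_length(cantor_stage(k)))
```

The loop prints, for each stage, the number of intervals and their total length, which can be compared with $2^k$ and $(2/3)^k$.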

2 The exterior measure 

The notion of exterior measure is the first of two important concepts 
needed to develop a theory of measure. We begin with the definition and 
basic properties of exterior measure. Loosely speaking, the exterior measure
$m_*$ assigns to any subset of $\mathbb{R}^d$ a first notion of size; various examples
show that this notion coincides with our earlier intuition. However, the 
exterior measure lacks the desirable property of additivity when taking 
the union of disjoint sets. We remedy this problem in the next section, 
where we discuss in detail the other key concept of measure theory, the 
notion of measurable sets. 

The exterior measure, as the name indicates, attempts to describe 
the volume of a set E by approximating it from the outside. The set 
E is covered by cubes, and if the covering gets finer, with fewer cubes 
overlapping, the volume of E should be close to the sum of the volumes 
of the cubes. 

The precise definition is as follows: if $E$ is any subset of $\mathbb{R}^d$, the
exterior measure$^1$ of $E$ is

$$m_*(E) = \inf \sum_{j=1}^\infty |Q_j|, \tag{1}$$

where the infimum is taken over all countable coverings $E \subset \bigcup_{j=1}^\infty Q_j$ by
closed cubes. The exterior measure is always non-negative but could be
infinite, so that in general we have $0 \le m_*(E) \le \infty$, and therefore takes
values in the extended positive numbers.

We make some preliminary remarks about the definition of the exterior 
measure given by (1). 

(i) It is important to note that it would not suffice to allow finite sums
in the definition of $m_*(E)$. The quantity that would be obtained if one
considered only coverings of $E$ by finite unions of cubes is in general
larger than $m_*(E)$. (See Exercise 14.)

(ii) One can, however, replace the coverings by cubes, with coverings
by rectangles; or with coverings by balls. That the former alternative
yields the same exterior measure is quite direct. (See Exercise 15.) The
equivalence with the latter is more subtle. (See Exercise 26 in Chapter 3.)

$^1$ Some authors use the term outer measure instead of exterior measure.

We begin our investigation of this new notion by providing examples 
of sets whose exterior measures can be calculated, and we check that 
the latter matches our intuitive idea of volume (length in one dimension, 
area in two dimensions, etc.) 

Example 1. The exterior measure of a point is zero. This is clear once 
we observe that a point is a cube with volume zero, and which covers 
itself. Of course the exterior measure of the empty set is also zero. 


Example 2. The exterior measure of a closed cube is equal to its volume.
Indeed, suppose $Q$ is a closed cube in $\mathbb{R}^d$. Since $Q$ covers itself, we must
have $m_*(Q) \le |Q|$. Therefore, it suffices to prove the reverse inequality.

We consider an arbitrary covering $Q \subset \bigcup_{j=1}^\infty Q_j$ by cubes, and note
that it suffices to prove that

$$|Q| \le \sum_{j=1}^\infty |Q_j|. \tag{2}$$

For a fixed $\epsilon > 0$ we choose for each $j$ an open cube $S_j$ which contains $Q_j$,
and such that $|S_j| \le (1 + \epsilon)|Q_j|$. From the open covering $\bigcup_{j=1}^\infty S_j$ of the
compact set $Q$, we may select a finite subcovering which, after possibly
renumbering the rectangles, we may write as $Q \subset \bigcup_{j=1}^N S_j$. Taking the
closure of the cubes $S_j$, we may apply Lemma 1.2 to conclude that
$|Q| \le \sum_{j=1}^N |S_j|$. Consequently,

$$|Q| \le (1 + \epsilon) \sum_{j=1}^N |Q_j| \le (1 + \epsilon) \sum_{j=1}^\infty |Q_j|.$$

Since $\epsilon$ is arbitrary, we find that the inequality (2) holds; thus $|Q| \le m_*(Q)$,
as desired.

Example 3. If $Q$ is an open cube, the result $m_*(Q) = |Q|$ still holds.
Since $Q$ is covered by its closure $\overline{Q}$, and $|\overline{Q}| = |Q|$, we immediately see
that $m_*(Q) \le |Q|$. To prove the reverse inequality, we note that if $Q_0$ is
a closed cube contained in $Q$, then $m_*(Q_0) \le m_*(Q)$, since any covering
of $Q$ by a countable number of closed cubes is also a covering of $Q_0$ (see
Observation 1 below). Hence $|Q_0| \le m_*(Q)$, and since we can choose $Q_0$
with a volume as close as we wish to $|Q|$, we must have $|Q| \le m_*(Q)$.
Example 4. The exterior measure of a rectangle $R$ is equal to its volume.
Indeed, arguing as in Example 2, we see that $|R| \le m_*(R)$. To obtain the
reverse inequality, consider a grid in $\mathbb{R}^d$ formed by cubes of side length
$1/k$. Then, if $\mathcal{Q}$ consists of the (finite) collection of all cubes entirely
contained in $R$, and $\mathcal{Q}'$ the (finite) collection of all cubes that intersect
both $R$ and the complement of $R$, we first note that $R \subset \bigcup_{Q \in (\mathcal{Q} \cup \mathcal{Q}')} Q$.
Also, a simple argument yields

$$\sum_{Q \in \mathcal{Q}} |Q| \le |R|.$$

Moreover, there are $O(k^{d-1})$ cubes$^2$ in $\mathcal{Q}'$, and these cubes have volume
$k^{-d}$, so that $\sum_{Q \in \mathcal{Q}'} |Q| = O(1/k)$. Hence

$$\sum_{Q \in (\mathcal{Q} \cup \mathcal{Q}')} |Q| \le |R| + O(1/k),$$

and letting $k$ tend to infinity yields $m_*(R) \le |R|$, as desired.
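The estimate in Example 4 can be observed numerically. The sketch below counts the grid cubes of side $1/k$ that meet a given rectangle (these are exactly the cubes in $\mathcal{Q} \cup \mathcal{Q}'$) and returns their total volume. It assumes the grid is the one whose vertices are the points with coordinates in $\frac{1}{k}\mathbb{Z}$, and the function name `grid_cover_volume` is a hypothetical name used only here.

```python
from fractions import Fraction
from math import ceil, floor


def grid_cover_volume(rect, k):
    """Total volume of the closed grid cubes of side 1/k that meet the rectangle.

    rect is a list of (a_j, b_j) pairs with a_j <= b_j.
    """
    count = 1
    for a, b in rect:
        # The interval [i/k, (i+1)/k] meets [a, b] exactly when a*k - 1 <= i <= b*k.
        lo = ceil(Fraction(a) * k - 1)
        hi = floor(Fraction(b) * k)
        count *= hi - lo + 1
    return Fraction(count, k ** len(rect))


rect = [(0, 1), (0, Fraction(1, 2))]  # a rectangle of volume 1/2
for k in (10, 100, 1000):
    print(k, float(grid_cover_volume(rect, k)))
```

For this rectangle the excess of the covering volume over $|R|$ is at most $3/k + 4/k^2$, which is the $O(1/k)$ term of the example.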

Example 5. The exterior measure of $\mathbb{R}^d$ is infinite. This follows from
the fact that any covering of $\mathbb{R}^d$ is also a covering of any cube $Q \subset \mathbb{R}^d$,
hence $|Q| \le m_*(\mathbb{R}^d)$. Since $Q$ can have arbitrarily large volume, we must
have $m_*(\mathbb{R}^d) = \infty$.


Example 6. The Cantor set $\mathcal{C}$ has exterior measure 0. From the construction
of $\mathcal{C}$, we know that $\mathcal{C} \subset C_k$, where each $C_k$ is a disjoint union
of $2^k$ closed intervals, each of length $3^{-k}$. Consequently, $m_*(\mathcal{C}) \le (2/3)^k$
for all $k$, hence $m_*(\mathcal{C}) = 0$.

Properties of the exterior measure 

The previous examples and comments provide some intuition underlying
the definition of exterior measure. Here, we turn to the further study of
$m_*$ and prove five properties of exterior measure that are needed in what
follows.

First, we record the following remark that is immediate from the definition
of $m_*$:

• For every $\epsilon > 0$, there exists a covering $E \subset \bigcup_{j=1}^\infty Q_j$ with

$$\sum_{j=1}^\infty |Q_j| \le m_*(E) + \epsilon.$$

$^2$ We remind the reader of the notation $f(x) = O(g(x))$, which means that
$|f(x)| \le C|g(x)|$ for some constant $C$ and all $x$ in a given range. In this
particular example, there are fewer than $Ck^{d-1}$ cubes in question, as $k \to \infty$.

The relevant properties of exterior measure are listed in a series of 
observations. 

Observation 1 (Monotonicity) If $E_1 \subset E_2$, then $m_*(E_1) \le m_*(E_2)$.

This follows once we observe that any covering of $E_2$ by a countable
collection of cubes is also a covering of $E_1$.

In particular, monotonicity implies that every bounded subset of $\mathbb{R}^d$
has finite exterior measure.

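Observation 2 (Countable sub-additivity) If $E = \bigcup_{j=1}^\infty E_j$, then
$m_*(E) \le \sum_{j=1}^\infty m_*(E_j)$.

First, we may assume that each $m_*(E_j) < \infty$, for otherwise the inequality
clearly holds. For any $\epsilon > 0$, the definition of the exterior measure
yields for each $j$ a covering $E_j \subset \bigcup_{k=1}^\infty Q_{k,j}$ by closed cubes with

$$\sum_{k=1}^\infty |Q_{k,j}| \le m_*(E_j) + \frac{\epsilon}{2^j}.$$

Then, $E \subset \bigcup_{j,k=1}^\infty Q_{k,j}$ is a covering of $E$ by closed cubes, and therefore

$$m_*(E) \le \sum_{j,k} |Q_{k,j}| = \sum_{j=1}^\infty \sum_{k=1}^\infty |Q_{k,j}|
\le \sum_{j=1}^\infty \left( m_*(E_j) + \frac{\epsilon}{2^j} \right)
= \sum_{j=1}^\infty m_*(E_j) + \epsilon.$$

Since this holds true for every $\epsilon > 0$, the second observation is proved.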
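As a worked illustration of how sub-additivity is used (this example is added here and the enumeration $x_1, x_2, \ldots$ is notation introduced for it), any countable set has exterior measure zero, because each single point has exterior measure zero by Example 1:

```latex
E = \{x_1, x_2, \ldots\} = \bigcup_{j=1}^{\infty} \{x_j\}
\quad\Longrightarrow\quad
0 \le m_*(E) \le \sum_{j=1}^{\infty} m_*(\{x_j\}) = 0 .
```

In particular, the set of rational numbers in $[0,1]$ has exterior measure zero, even though it is dense in $[0,1]$.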

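Observation 3 If $E \subset \mathbb{R}^d$, then $m_*(E) = \inf m_*(\mathcal{O})$, where the infimum
is taken over all open sets $\mathcal{O}$ containing $E$.

By monotonicity, it is clear that the inequality $m_*(E) \le \inf m_*(\mathcal{O})$
holds. For the reverse inequality, let $\epsilon > 0$ and choose cubes $Q_j$ such
that $E \subset \bigcup_{j=1}^\infty Q_j$, with

$$\sum_{j=1}^\infty |Q_j| \le m_*(E) + \frac{\epsilon}{2}.$$

Let $Q_j^0$ denote an open cube containing $Q_j$, and such that $|Q_j^0| \le |Q_j| + \epsilon/2^{j+1}$.
Then $\mathcal{O} = \bigcup_{j=1}^\infty Q_j^0$ is open, and by Observation 2

$$m_*(\mathcal{O}) \le \sum_{j=1}^\infty m_*(Q_j^0) = \sum_{j=1}^\infty |Q_j^0|
\le \sum_{j=1}^\infty \left( |Q_j| + \frac{\epsilon}{2^{j+1}} \right)
\le \sum_{j=1}^\infty |Q_j| + \frac{\epsilon}{2}
\le m_*(E) + \epsilon.$$

Hence $\inf m_*(\mathcal{O}) \le m_*(E)$, as was to be shown.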

Observation 4 If $E = E_1 \cup E_2$, and $d(E_1, E_2) > 0$, then

$$m_*(E) = m_*(E_1) + m_*(E_2).$$

By Observation 2, we already know that $m_*(E) \le m_*(E_1) + m_*(E_2)$,
so it suffices to prove the reverse inequality. To this end, we first select $\delta$
such that $d(E_1, E_2) > \delta > 0$. Next, we choose a covering $E \subset \bigcup_{j=1}^\infty Q_j$ by
closed cubes, with $\sum_{j=1}^\infty |Q_j| \le m_*(E) + \epsilon$. We may, after subdividing
the cubes $Q_j$, assume that each $Q_j$ has a diameter less than $\delta$. In this
case, each $Q_j$ can intersect at most one of the two sets $E_1$ or $E_2$. If we
denote by $J_1$ and $J_2$ the sets of those indices $j$ for which $Q_j$ intersects
$E_1$ and $E_2$, respectively, then $J_1 \cap J_2$ is empty, and we have

$$E_1 \subset \bigcup_{j \in J_1} Q_j \quad\text{as well as}\quad E_2 \subset \bigcup_{j \in J_2} Q_j.$$

Therefore,

$$m_*(E_1) + m_*(E_2) \le \sum_{j \in J_1} |Q_j| + \sum_{j \in J_2} |Q_j|
\le \sum_{j=1}^\infty |Q_j|
\le m_*(E) + \epsilon.$$

Since $\epsilon$ is arbitrary, the proof of Observation 4 is complete.

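Observation 5 If a set $E$ is the countable union of almost disjoint cubes
$E = \bigcup_{j=1}^\infty Q_j$, then

$$m_*(E) = \sum_{j=1}^\infty |Q_j|.$$

Let $\tilde{Q}_j$ denote a cube strictly contained in $Q_j$ such that
$|Q_j| \le |\tilde{Q}_j| + \epsilon/2^j$, where $\epsilon$ is arbitrary but fixed. Then, for every $N$, the cubes
$\tilde{Q}_1, \tilde{Q}_2, \ldots, \tilde{Q}_N$ are disjoint, hence at a positive distance from one another,
and repeated applications of Observation 4 imply

$$m_*\left( \bigcup_{j=1}^N \tilde{Q}_j \right) = \sum_{j=1}^N |\tilde{Q}_j| \ge \sum_{j=1}^N \left( |Q_j| - \epsilon/2^j \right).$$

Since $\bigcup_{j=1}^N \tilde{Q}_j \subset E$, we conclude that for every integer $N$,

$$m_*(E) \ge \sum_{j=1}^N |Q_j| - \epsilon.$$

In the limit as $N$ tends to infinity we deduce $\sum_{j=1}^\infty |Q_j| \le m_*(E) + \epsilon$
for every $\epsilon > 0$, hence $\sum_{j=1}^\infty |Q_j| \le m_*(E)$. Therefore, combined with
Observation 2, our result proves that we have equality.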

This last property shows that if a set can be decomposed into almost
disjoint cubes, its exterior measure equals the sum of the volumes of the
cubes. In particular, by Theorem 1.4 we see that the exterior measure of
an open set equals the sum of the volumes of the cubes in a decomposition,
and this coincides with our initial guess. Moreover, this also yields
a proof that the sum is independent of the decomposition.

One can see from this that the volumes of simple sets that are calculated
by elementary calculus agree with their exterior measure. This
assertion can be proved most easily once we have developed the requisite
tools in integration theory. (See Chapter 2.) In particular, we can then
verify that the exterior measure of a ball (either open or closed) equals
its volume.

Despite Observations 4 and 5, one cannot conclude in general that if
$E_1 \cup E_2$ is a disjoint union of subsets of $\mathbb{R}^d$, then

$$m_*(E_1 \cup E_2) = m_*(E_1) + m_*(E_2). \tag{3}$$

In fact (3) holds when the sets in question are not highly irregular or
“pathological” but are measurable in the sense described below.

3 Measurable sets and the Lebesgue measure 

The notion of measurability isolates a collection of subsets in $\mathbb{R}^d$ for
which the exterior measure satisfies all our desired properties, including
additivity (and in fact countable additivity) for disjoint unions of sets.

There are a number of different ways of defining measurability, but
these all turn out to be equivalent. Probably the simplest and most
intuitive is the following: A subset $E$ of $\mathbb{R}^d$ is Lebesgue measurable,
or simply measurable, if for any $\epsilon > 0$ there exists an open set $\mathcal{O}$ with
$E \subset \mathcal{O}$ and

$$m_*(\mathcal{O} - E) \le \epsilon.$$

This should be compared to Observation 3, which holds for all sets $E$.

If $E$ is measurable, we define its Lebesgue measure (or measure)
$m(E)$ by

$$m(E) = m_*(E).$$

Clearly, the Lebesgue measure inherits all the features contained in
Observations 1 - 5 of the exterior measure.

Immediately from the definition, we find: 

Property 1 Every open set in $\mathbb{R}^d$ is measurable.

Our immediate goal now is to gather various further properties of
measurable sets. In particular, we shall prove that the collection of
measurable sets behaves well under the various operations of set theory:
countable unions, countable intersections, and complements.
Property 2 If $m_*(E) = 0$, then $E$ is measurable. In particular, if $F$ is
a subset of a set of exterior measure 0, then $F$ is measurable.

By Observation 3 of the exterior measure, for every $\epsilon > 0$ there exists
an open set $\mathcal{O}$ with $E \subset \mathcal{O}$ and $m_*(\mathcal{O}) \le \epsilon$. Since $(\mathcal{O} - E) \subset \mathcal{O}$,
monotonicity implies $m_*(\mathcal{O} - E) \le \epsilon$, as desired.

As a consequence of this property, we deduce that the Cantor set $\mathcal{C}$ in
Example 6 is measurable and has measure 0.

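Property 3 A countable union of measurable sets is measurable.

Suppose $E = \bigcup_{j=1}^\infty E_j$, where each $E_j$ is measurable. Given $\epsilon > 0$, we
may choose for each $j$ an open set $\mathcal{O}_j$ with $E_j \subset \mathcal{O}_j$ and
$m_*(\mathcal{O}_j - E_j) \le \epsilon/2^j$. Then the union $\mathcal{O} = \bigcup_{j=1}^\infty \mathcal{O}_j$ is open, $E \subset \mathcal{O}$, and
$(\mathcal{O} - E) \subset \bigcup_{j=1}^\infty (\mathcal{O}_j - E_j)$, so monotonicity and sub-additivity of the
exterior measure imply

$$m_*(\mathcal{O} - E) \le \sum_{j=1}^\infty m_*(\mathcal{O}_j - E_j) \le \epsilon.$$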

j=i 

Property 4 Closed sets are measurable. 

First, we observe that it suffices to prove that compact sets are measurable.
Indeed, any closed set $F$ can be written as the union of compact
sets, say $F = \bigcup_{k=1}^\infty F \cap B_k$, where $B_k$ denotes the closed ball of radius $k$
centered at the origin; then Property 3 applies.

So, suppose $F$ is compact (so that in particular $m_*(F) < \infty$), and let
$\epsilon > 0$. By Observation 3 we can select an open set $\mathcal{O}$ with $F \subset \mathcal{O}$ and
$m_*(\mathcal{O}) \le m_*(F) + \epsilon$. Since $F$ is closed, the difference $\mathcal{O} - F$ is open,
and by Theorem 1.4 we may write this difference as a countable union
of almost disjoint cubes

$$\mathcal{O} - F = \bigcup_{j=1}^\infty Q_j.$$

For a fixed $N$, the finite union $K = \bigcup_{j=1}^N Q_j$ is compact; therefore
$d(K, F) > 0$ (we isolate this little fact in a lemma below). Since
$(K \cup F) \subset \mathcal{O}$, Observations 1, 4, and 5 of the exterior measure imply

$$m_*(\mathcal{O}) \ge m_*(F) + m_*(K) = m_*(F) + \sum_{j=1}^N m_*(Q_j).$$
Hence $\sum_{j=1}^N m_*(Q_j) \le m_*(\mathcal{O}) - m_*(F) \le \epsilon$, and this also holds in the
limit as $N$ tends to infinity. Invoking the sub-additivity property of the
exterior measure finally yields

$$m_*(\mathcal{O} - F) \le \sum_{j=1}^\infty m_*(Q_j) \le \epsilon,$$

as desired.

We digress briefly to complete the above argument by proving the 
following. 

Lemma 3.1 If $F$ is closed, $K$ is compact, and these sets are disjoint,
then $d(F, K) > 0$.

Proof. Since $F$ is closed, for each point $x \in K$, there exists $\delta_x > 0$ so
that $d(x, F) > 3\delta_x$. Since $\bigcup_{x \in K} B_{2\delta_x}(x)$ covers $K$, and $K$ is compact, we
may find a finite subcover, which we denote by $\bigcup_{j=1}^N B_{2\delta_j}(x_j)$. If we let
$\delta = \min(\delta_1, \ldots, \delta_N)$, then we must have $d(K, F) \ge \delta > 0$. Indeed, if $x \in K$
and $y \in F$, then for some $j$ we have $|x_j - x| \le 2\delta_j$, and by construction
$|y - x_j| \ge 3\delta_j$. Therefore

$$|y - x| \ge |y - x_j| - |x_j - x| \ge 3\delta_j - 2\delta_j \ge \delta,$$

and the lemma is proved.


Property 5 The complement of a measurable set is measurable.

If $E$ is measurable, then for every positive integer $n$ we may choose an
open set $\mathcal{O}_n$ with $E \subset \mathcal{O}_n$ and $m_*(\mathcal{O}_n - E) \le 1/n$. The complement $\mathcal{O}_n^c$
is closed, hence measurable, which implies that the union $S = \bigcup_{n=1}^\infty \mathcal{O}_n^c$
is also measurable by Property 3. Now we simply note that $S \subset E^c$, and

$$(E^c - S) \subset (\mathcal{O}_n - E),$$

such that $m_*(E^c - S) \le 1/n$ for all $n$. Therefore, $m_*(E^c - S) = 0$, and
$E^c - S$ is measurable by Property 2. Therefore $E^c$ is measurable since
it is the union of two measurable sets, namely $S$ and $(E^c - S)$.

Property 6 A countable intersection of measurable sets is measurable. 

This follows from Properties 3 and 5, since

$$\bigcap_{j=1}^\infty E_j = \left( \bigcup_{j=1}^\infty E_j^c \right)^c.$$

In conclusion, we find that the family of measurable sets is closed under
the familiar operations of set theory. We point out that we have shown
more than simply closure with respect to finite unions and intersections:
we have proved that the collection of measurable sets is closed under
countable unions and intersections. This passage from finite operations
to infinite ones is crucial in the context of analysis. We emphasize, however,
that the operations of uncountable unions or intersections are not
permissible when dealing with measurable sets!


Theorem 3.2 If $E_1, E_2, \ldots$, are disjoint measurable sets, and
$E = \bigcup_{j=1}^\infty E_j$, then

$$m(E) = \sum_{j=1}^\infty m(E_j).$$

Proof. First, we assume further that each $E_j$ is bounded. Then, for
each $j$, by applying the definition of measurability to $E_j^c$, we can choose
a closed subset $F_j$ of $E_j$ with $m_*(E_j - F_j) \le \epsilon/2^j$. For each fixed $N$,
the sets $F_1, \ldots, F_N$ are compact and disjoint, so that
$m\left( \bigcup_{j=1}^N F_j \right) = \sum_{j=1}^N m(F_j)$. Since $\bigcup_{j=1}^N F_j \subset E$,
we must have

$$m(E) \ge \sum_{j=1}^N m(F_j) \ge \sum_{j=1}^N m(E_j) - \epsilon.$$

Letting $N$ tend to infinity, since $\epsilon$ was arbitrary we find that

$$m(E) \ge \sum_{j=1}^\infty m(E_j).$$

Since the reverse inequality always holds (by sub-additivity in Observation 2),
this concludes the proof when each $E_j$ is bounded.

In the general case, we select any sequence of cubes $\{Q_k\}_{k=1}^\infty$ that
increases to $\mathbb{R}^d$, in the sense that $Q_k \subset Q_{k+1}$ for all $k \ge 1$ and
$\bigcup_{k=1}^\infty Q_k = \mathbb{R}^d$. We then let $S_1 = Q_1$ and $S_k = Q_k - Q_{k-1}$ for $k \ge 2$. If we define
measurable sets by $E_{j,k} = E_j \cap S_k$, then

$$E = \bigcup_{j,k} E_{j,k}.$$

The union above is disjoint and every $E_{j,k}$ is bounded. Moreover
$E_j = \bigcup_{k=1}^\infty E_{j,k}$, and this union is also disjoint. Putting these facts together,
and using what has already been proved, we obtain

$$m(E) = \sum_{j,k} m(E_{j,k}) = \sum_j \sum_k m(E_{j,k}) = \sum_j m(E_j),$$

as claimed.

With this, the countable additivity of the Lebesgue measure on measurable
sets has been established. This result provides the necessary
connection between the following:

• our primitive notion of volume given by the exterior measure, 

• the more refined idea of measurable sets, and 


• the countably infinite operations allowed on these sets. 


We make two definitions to state succinctly some further consequences.
If $E_1, E_2, \ldots$ is a countable collection of subsets of $\mathbb{R}^d$ that increases
to $E$ in the sense that $E_k \subset E_{k+1}$ for all $k$, and $E = \bigcup_{k=1}^\infty E_k$, then we
write $E_k \nearrow E$.

Similarly, if $E_1, E_2, \ldots$ decreases to $E$ in the sense that $E_k \supset E_{k+1}$ for
all $k$, and $E = \bigcap_{k=1}^\infty E_k$, we write $E_k \searrow E$.

Corollary 3.3 Suppose $E_1, E_2, \ldots$ are measurable subsets of $\mathbb{R}^d$.

(i) If $E_k \nearrow E$, then $m(E) = \lim_{N \to \infty} m(E_N)$.

(ii) If $E_k \searrow E$ and $m(E_k) < \infty$ for some $k$, then

$$m(E) = \lim_{N \to \infty} m(E_N).$$

Proof. For the first part, let $G_1 = E_1$, $G_2 = E_2 - E_1$, and in general
$G_k = E_k - E_{k-1}$ for $k \ge 2$. By their construction, the sets $G_k$ are
measurable, disjoint, and $E = \bigcup_{k=1}^\infty G_k$. Hence

$$m(E) = \sum_{k=1}^\infty m(G_k) = \lim_{N \to \infty} \sum_{k=1}^N m(G_k) = \lim_{N \to \infty} m\left( \bigcup_{k=1}^N G_k \right),$$

and since $\bigcup_{k=1}^N G_k = E_N$ we get the desired limit.

For the second part, we may clearly assume that $m(E_1) < \infty$. Let
$G_k = E_k - E_{k+1}$ for each $k$, so that

$$E_1 = E \cup \bigcup_{k=1}^\infty G_k$$

is a disjoint union of measurable sets. As a result, we find that

$$m(E_1) = m(E) + \lim_{N \to \infty} \sum_{k=1}^{N-1} \left( m(E_k) - m(E_{k+1}) \right)
= m(E) + m(E_1) - \lim_{N \to \infty} m(E_N).$$

Hence, since $m(E_1) < \infty$, we see that $m(E) = \lim_{N \to \infty} m(E_N)$, and the
proof is complete.

The reader should note that the second conclusion may fail without
the assumption that $m(E_k) < \infty$ for some $k$. This is shown by the simple
example when $E_n = (n, \infty) \subset \mathbb{R}$, for all $n$.
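Spelled out, the failure in this example is the following short computation (a worked restatement added for illustration, using only monotonicity and the fact that $(n, \infty)$ contains intervals of arbitrarily large length):

```latex
m(E_n) = m\bigl((n,\infty)\bigr) = \infty \quad \text{for every } n,
\qquad
E = \bigcap_{n=1}^{\infty} E_n = \emptyset,
\qquad
m(E) = 0 \neq \infty = \lim_{n \to \infty} m(E_n).
```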

What follows provides an important geometric and analytic insight 
into the nature of measurable sets, in terms of their relation to open and 
closed sets. Its thrust is that, in effect, an arbitrary measurable set can 
be well approximated by the open sets that contain it, and alternatively, 
by the closed sets it contains. 

Theorem 3.4 Suppose $E$ is a measurable subset of $\mathbb{R}^d$. Then, for every
$\epsilon > 0$:

(i) There exists an open set $\mathcal{O}$ with $E \subset \mathcal{O}$ and $m(\mathcal{O} - E) \le \epsilon$.

(ii) There exists a closed set $F$ with $F \subset E$ and $m(E - F) \le \epsilon$.

(iii) If $m(E)$ is finite, there exists a compact set $K$ with $K \subset E$ and
$m(E - K) \le \epsilon$.

(iv) If $m(E)$ is finite, there exists a finite union $F = \bigcup_{j=1}^N Q_j$ of closed
cubes such that

$$m(E \triangle F) \le \epsilon.$$

The notation $E \triangle F$ stands for the symmetric difference between the
sets $E$ and $F$, defined by $E \triangle F = (E - F) \cup (F - E)$, which consists of
those points that belong to only one of the two sets $E$ or $F$.

Proof. Part (i) is just the definition of measurability. For the second
part, we know that $E^c$ is measurable, so there exists an open set $\mathcal{O}$ with
$E^c \subset \mathcal{O}$ and $m(\mathcal{O} - E^c) \le \epsilon$. If we let $F = \mathcal{O}^c$, then $F$ is closed, $F \subset E$,
and $E - F = \mathcal{O} - E^c$. Hence $m(E - F) \le \epsilon$ as desired.

For (iii), we first pick a closed set $F$ so that $F \subset E$ and $m(E - F) \le \epsilon/2$.
For each $n$, we let $B_n$ denote the closed ball centered at the origin of radius
$n$, and define compact sets $K_n = F \cap B_n$. Then $E - K_n$ is a sequence
of measurable sets that decreases to $E - F$, and since $m(E) < \infty$, we
conclude that for all large $n$ one has $m(E - K_n) \le \epsilon$.

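For the last part, choose a family of closed cubes $\{Q_j\}_{j=1}^\infty$ so that

$$E \subset \bigcup_{j=1}^\infty Q_j \quad\text{and}\quad \sum_{j=1}^\infty |Q_j| \le m(E) + \epsilon/2.$$

Since $m(E) < \infty$, the series converges and there exists $N > 0$ such that
$\sum_{j=N+1}^\infty |Q_j| < \epsilon/2$. If $F = \bigcup_{j=1}^N Q_j$, then

$$\begin{aligned}
m(E \triangle F) &= m(E - F) + m(F - E) \\
&\le m\left( \bigcup_{j=N+1}^\infty Q_j \right) + m\left( \bigcup_{j=1}^\infty Q_j - E \right) \\
&\le \sum_{j=N+1}^\infty |Q_j| + \sum_{j=1}^\infty |Q_j| - m(E) \\
&\le \epsilon.
\end{aligned}$$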

Invariance properties of Lebesgue measure 

A crucial property of Lebesgue measure in $\mathbb{R}^d$ is its translation-invariance,
which can be stated as follows: if $E$ is a measurable set and $h \in \mathbb{R}^d$, then
the set $E_h = E + h = \{x + h : x \in E\}$ is also measurable, and
$m(E + h) = m(E)$. With the observation that this holds for the special case
when $E$ is a cube, one passes to the exterior measure of arbitrary sets
$E$, and sees from the definition of $m_*$ given in Section 2 that
$m_*(E_h) = m_*(E)$. To prove the measurability of $E_h$ under the assumption that $E$
is measurable, we note that if $\mathcal{O}$ is open, $\mathcal{O} \supset E$, and $m_*(\mathcal{O} - E) < \epsilon$,
then $\mathcal{O}_h$ is open, $\mathcal{O}_h \supset E_h$, and $m_*(\mathcal{O}_h - E_h) < \epsilon$.

In the same way one can prove the relative dilation-invariance of
Lebesgue measure. Suppose $\delta > 0$, and denote by $\delta E$ the set
$\{\delta x : x \in E\}$. We can then assert that $\delta E$ is measurable whenever $E$ is,
and $m(\delta E) = \delta^d m(E)$. One can also easily see that Lebesgue measure
is reflection-invariant. That is, whenever $E$ is measurable, so is
$-E = \{-x : x \in E\}$ and $m(-E) = m(E)$.

Other invariance properties of Lebesgue measure are in Exercises 7
and 8, and Problem 4 of Chapter 2.
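For rectangles, where the measure is just the volume, the translation and dilation rules can be verified directly. The sketch below only treats this special case; the helper names `volume`, `translate`, and `dilate`, and the representation of a rectangle as a list of `(a_j, b_j)` pairs, are assumptions made for illustration.

```python
from fractions import Fraction


def volume(rect):
    """Volume of a rectangle given as a list of (a_j, b_j) pairs."""
    v = Fraction(1)
    for a, b in rect:
        v *= Fraction(b) - Fraction(a)
    return v


def translate(rect, h):
    """The rectangle R + h, for a vector h with one entry per coordinate."""
    return [(a + hj, b + hj) for (a, b), hj in zip(rect, h)]


def dilate(rect, delta):
    """The rectangle delta * R, for a factor delta > 0."""
    return [(delta * a, delta * b) for a, b in rect]


R = [(0, 2), (1, 4), (-1, 0)]  # a rectangle in dimension d = 3
print(volume(R), volume(translate(R, (5, -3, Fraction(1, 2)))), volume(dilate(R, 3)))
```

Here the translated rectangle has the same volume as $R$, while dilating by $\delta = 3$ in dimension $d = 3$ multiplies the volume by $\delta^d = 27$.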
σ-algebras and Borel sets

A σ-algebra of sets is a collection of subsets of $\mathbb{R}^d$ that is closed under
countable unions, countable intersections, and complements.

The collection of all subsets of $\mathbb{R}^d$ is of course a σ-algebra. A more
interesting and relevant example consists of all measurable sets in $\mathbb{R}^d$,
which we have just shown also forms a σ-algebra.

Another σ-algebra, which plays a vital role in analysis, is the Borel
σ-algebra in $\mathbb{R}^d$, denoted by $\mathcal{B}_{\mathbb{R}^d}$, which by definition is the smallest
σ-algebra that contains all open sets. Elements of this σ-algebra are called
Borel sets.

The definition of the Borel σ-algebra will be meaningful once we have
defined the term “smallest,” and shown that such a σ-algebra exists and
is unique. The term “smallest” means that if $\mathcal{S}$ is any σ-algebra that
contains all open sets in $\mathbb{R}^d$, then necessarily $\mathcal{B}_{\mathbb{R}^d} \subset \mathcal{S}$. Since we observe
that any intersection (not necessarily countable) of σ-algebras is again a
σ-algebra, we may define $\mathcal{B}_{\mathbb{R}^d}$ as the intersection of all σ-algebras that
contain the open sets. This shows the existence and uniqueness of the
Borel σ-algebra.
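The idea of the smallest σ-algebra containing a given family can be made concrete on a finite set, where countable operations reduce to finite ones. The following is only a toy sketch under that assumption: it closes a family of subsets of a finite universe under complements and unions, and the function name `generated_sigma_algebra` is a hypothetical name introduced here. It is not a construction of the Borel sets of $\mathbb{R}^d$.

```python
from itertools import combinations


def generated_sigma_algebra(universe, generators):
    """Smallest family of subsets of a finite universe that contains the generators
    and is closed under complements and unions (hence also under intersections)."""
    universe = frozenset(universe)
    family = {frozenset(), universe} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        # Close under complements.
        for a in current:
            comp = universe - a
            if comp not in family:
                family.add(comp)
                changed = True
        # Close under pairwise unions; repeated passes give all finite unions.
        for a, b in combinations(current, 2):
            u = a | b
            if u not in family:
                family.add(u)
                changed = True
    return family


fam = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(s) for s in fam))
```

In this example the generated family consists of all unions of the three blocks $\{1\}$, $\{2\}$, $\{3,4\}$, so the set $\{3\}$ does not belong to it.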

Since open sets are measurable, we conclude that the Borel σ-algebra
is contained in the σ-algebra of measurable sets. Naturally, we may ask
if this inclusion is strict: do there exist Lebesgue measurable sets which
are not Borel sets? The answer is “yes.” (See Exercise 35.)

From the point of view of the Borel sets, the Lebesgue sets arise as
the completion of the σ-algebra of Borel sets, that is, by adjoining all
subsets of Borel sets of measure zero. This is an immediate consequence
of Corollary 3.5 below.

Starting with the open and closed sets, which are the simplest Borel
sets, one could try to list the Borel sets in order of their complexity. Next
in order would come countable intersections of open sets; such sets are
called $G_\delta$ sets. Alternatively, one could consider their complements, the
countable union of closed sets, called the $F_\sigma$ sets.$^3$

Corollary 3.5 A subset $E$ of $\mathbb{R}^d$ is measurable

(i) if and only if $E$ differs from a $G_\delta$ by a set of measure zero,

(ii) if and only if $E$ differs from an $F_\sigma$ by a set of measure zero.

Proof. Clearly $E$ is measurable whenever it satisfies either (i) or (ii),
since the $F_\sigma$, $G_\delta$, and sets of measure zero are measurable.


3 The terminology $G_\delta$ comes from German “Gebiete” and “Durchschnitt”; $F_\sigma$ comes from
French “fermé” and “somme.”

Conversely, if $E$ is measurable, then for each integer $n \ge 1$ we may
select an open set $\mathcal{O}_n$ that contains $E$, and such that
$m(\mathcal{O}_n - E) \le 1/n$. Then $S = \bigcap_{n=1}^\infty \mathcal{O}_n$ is a
$G_\delta$ that contains $E$, and $(S - E) \subset (\mathcal{O}_n - E)$
for all $n$. Therefore $m(S - E) \le 1/n$ for all $n$; hence $S - E$ has exterior
measure zero, and is therefore measurable.

For the second implication, we simply apply part (ii) of Theorem 3.4
with $\epsilon = 1/n$, and take the union of the resulting closed sets.


Construction of a non-measurable set

Are all subsets of $\mathbb{R}^d$ measurable? In this section, we answer this
question when $d = 1$ by constructing a subset of $\mathbb{R}$ which is not
measurable. 4 This justifies the conclusion that a satisfactory theory of
measure cannot encompass all subsets of $\mathbb{R}$.

The construction of a non-measurable set $\mathcal{N}$ uses the axiom of choice,
and rests on a simple equivalence relation among real numbers in $[0,1]$.

We write $x \sim y$ whenever $x - y$ is rational, and note that this is an
equivalence relation since the following properties hold:

• $x \sim x$ for every $x \in [0,1]$

• if $x \sim y$, then $y \sim x$

• if $x \sim y$ and $y \sim z$, then $x \sim z$.

Two equivalence classes either are disjoint or coincide, and $[0,1]$ is the
disjoint union of all equivalence classes, which we write as

$$[0,1] = \bigcup_{\alpha} \mathcal{E}_\alpha.$$

Now we construct the set $\mathcal{N}$ by choosing exactly one element $x_\alpha$ from
each $\mathcal{E}_\alpha$, and setting $\mathcal{N} = \{x_\alpha\}$. This (seemingly obvious) step requires
further comment, which we postpone until after the proof of the following
theorem.

Theorem 3.6 The set $\mathcal{N}$ is not measurable.

The proof is by contradiction, so we assume that $\mathcal{N}$ is measurable. Let
$\{r_k\}_{k=1}^\infty$ be an enumeration of all the rationals in $[-1,1]$, and consider
the translates

$$\mathcal{N}_k = \mathcal{N} + r_k.$$


4 The existence of such a set in $\mathbb{R}$ implies the existence of corresponding non-measurable
subsets of $\mathbb{R}^d$ for each $d$, as a consequence of Proposition 3.4 in the next chapter.

We claim that the sets $\mathcal{N}_k$ are disjoint, and

$$[0,1] \subset \bigcup_{k=1}^\infty \mathcal{N}_k \subset [-1,2]. \tag{4}$$

To see why these sets are disjoint, suppose that the intersection
$\mathcal{N}_k \cap \mathcal{N}_{k'}$ is non-empty. Then there exist rationals
$r_k \ne r_{k'}$ and $\alpha$ and $\beta$ with $x_\alpha + r_k = x_\beta + r_{k'}$; hence

$$x_\alpha - x_\beta = r_{k'} - r_k.$$

Consequently $\alpha \ne \beta$ and $x_\alpha - x_\beta$ is rational; hence
$x_\alpha \sim x_\beta$, which contradicts the fact that $\mathcal{N}$ contains only one
representative of each equivalence class.

The second inclusion is straightforward since each $\mathcal{N}_k$ is contained in
$[-1,2]$ by construction. Finally, if $x \in [0,1]$, then $x \sim x_\alpha$ for some $\alpha$, and
therefore $x - x_\alpha = r_k$ for some $k$. Hence $x \in \mathcal{N}_k$, and the first inclusion
holds.

Now we may conclude the proof of the theorem. If $\mathcal{N}$ were measurable,
then so would be $\mathcal{N}_k$ for all $k$, and since the union
$\bigcup_{k=1}^\infty \mathcal{N}_k$ is disjoint, the inclusions in (4) yield

$$1 \le \sum_{k=1}^\infty m(\mathcal{N}_k) \le 3.$$

Since $\mathcal{N}_k$ is a translate of $\mathcal{N}$, we must have
$m(\mathcal{N}_k) = m(\mathcal{N})$ for all $k$. Consequently,

$$1 \le \sum_{k=1}^\infty m(\mathcal{N}) \le 3.$$

This is the desired contradiction, since neither $m(\mathcal{N}) = 0$ nor
$m(\mathcal{N}) > 0$ is possible.

Axiom of choice 

That the construction of the set $\mathcal{N}$ is possible is based on the following
general proposition.

• Suppose $E$ is a set and $\{E_\alpha\}$ is a collection of non-empty subsets
of $E$. (The indexing set of $\alpha$'s is not assumed to be countable.)
Then there is a function $\alpha \mapsto x_\alpha$ (a “choice function”) such that
$x_\alpha \in E_\alpha$, for all $\alpha$.



In this general form this assertion is known as the axiom of choice. 
This axiom occurs (at least implicitly) in many proofs in mathematics, 
but because of its seeming intuitive self-evidence, its significance was 
not at first understood. The initial realization of the importance of 
this axiom was in its use to prove a famous assertion of Cantor, the 
well-ordering principle. This proposition (sometimes referred to as 
“transfinite induction”) can be formulated as follows.

A set $E$ is linearly ordered if there is a binary relation $\le$ such that:

(a) $x \le x$ for all $x \in E$.

(b) If $x, y \in E$ are distinct, then either $x \le y$ or $y \le x$ (but not both).

(c) If $x \le y$ and $y \le z$, then $x \le z$.

We say that a set $E$ can be well-ordered if it can be linearly ordered in
such a way that every non-empty subset $A \subset E$ has a smallest element
in that ordering (that is, an element $x_0 \in A$ such that $x_0 \le x$ for any
other $x \in A$).

A simple example of a well-ordered set is $\mathbb{Z}^+$, the positive integers with
their usual ordering. The fact that $\mathbb{Z}^+$ is well-ordered is an essential part
of the usual (finite) induction principle. More generally, the well-ordering
principle states:

• Any set E can be well-ordered. 

It is in fact nearly obvious that the well-ordering principle implies the
axiom of choice: if we well-order $E$, we can choose $x_\alpha$ to be the smallest
element in $E_\alpha$, and in this way we have constructed the required choice
function. It is also true, but not as easy to show, that the converse
implication holds, namely that the axiom of choice implies the well-ordering
principle. (See Problem 6 for another equivalent formulation of the
Axiom of Choice.)

We shall follow the common practice of assuming the axiom of choice
(and hence the validity of the well-ordering principle). 5 However, we
should point out that while the axiom of choice seems self-evident, the
well-ordering principle leads quickly to some baffling conclusions: one
only needs to spend a little time trying to imagine what a well-ordering
of the reals might look like!


5 It can be proved that in an appropriate formulation of the axioms of set theory, the
axiom of choice is independent of the other axioms; thus we are free to accept its validity.

4 Measurable functions

With the notion of measurable sets in hand, we now turn our attention
to the objects that lie at the heart of integration theory: measurable
functions.

The starting point is the notion of a characteristic function of a set
$E$, which is defined by

$$\chi_E(x) = \begin{cases} 1 & \text{if } x \in E, \\ 0 & \text{if } x \notin E. \end{cases}$$


The next step is to pass to the functions that are the building blocks of
integration theory. For the Riemann integral it is in effect the class of
step functions, with each given as a finite sum

$$f = \sum_{k=1}^N a_k \chi_{R_k}, \tag{5}$$

where each $R_k$ is a rectangle, and the $a_k$ are constants.

However, for the Lebesgue integral we need a more general notion, as
we shall see in the next chapter. A simple function is a finite sum

$$f = \sum_{k=1}^N a_k \chi_{E_k}, \tag{6}$$

where each $E_k$ is a measurable set of finite measure, and the $a_k$ are
constants.


4.1 Definition and basic properties 

We begin by considering only real-valued functions $f$ on $\mathbb{R}^d$, which we
allow to take on the infinite values $+\infty$ and $-\infty$, so that $f(x)$ belongs
to the extended real numbers

$$-\infty \le f(x) \le \infty.$$

We shall say that $f$ is finite-valued if $-\infty < f(x) < \infty$ for all $x$. In
the theory that follows, and the many applications of it, we shall almost
always find ourselves in situations where a function takes on infinite
values on at most a set of measure zero.

A function $f$ defined on a measurable subset $E$ of $\mathbb{R}^d$ is measurable,
if for all $a \in \mathbb{R}$, the set

$$f^{-1}([-\infty, a)) = \{x \in E : f(x) < a\}$$

is measurable. To simplify our notation, we shall often denote the set
$\{x \in E : f(x) < a\}$ simply by $\{f < a\}$ whenever no confusion is possible.


First, we note that there are many equivalent definitions of measurable
functions. For example, we may require instead that the inverse image of
closed intervals be measurable. Indeed, to prove that $f$ is measurable if
and only if $\{x : f(x) \le a\} = \{f \le a\}$ is measurable for every $a$, we note
that in one direction, one has

$$\{f \le a\} = \bigcap_{k=1}^\infty \{f < a + 1/k\},$$

and recall that the countable intersection of measurable sets is
measurable. For the other direction, we observe that

$$\{f < a\} = \bigcup_{k=1}^\infty \{f \le a - 1/k\}.$$

Similarly, $f$ is measurable if and only if $\{f \ge a\}$ (or $\{f > a\}$) is
measurable for every $a$. In the first case this is immediate from our definition
and the fact that $\{f \ge a\}$ is the complement of $\{f < a\}$, and in the
second case this follows from what we have just proved and the fact that
$\{f \le a\} = \{f > a\}^c$. A simple consequence is that $-f$ is measurable
whenever $f$ is measurable.

In the same way, one can show that if $f$ is finite-valued, then it is
measurable if and only if the sets $\{a < f < b\}$ are measurable for every
$a, b \in \mathbb{R}$. Similar conclusions hold for whichever combination of strict or
weak inequalities one chooses. For example, if $f$ is finite-valued, then it
is measurable if and only if $\{a \le f \le b\}$ is measurable for all
$a, b \in \mathbb{R}$. By the same arguments one sees the following:

Property 1 The finite-valued function $f$ is measurable if and only if
$f^{-1}(\mathcal{O})$ is measurable for every open set $\mathcal{O}$, and if and only if
$f^{-1}(F)$ is measurable for every closed set $F$.

Note that this property also applies to extended-valued functions, if we
make the additional hypothesis that both $f^{-1}(\infty)$ and $f^{-1}(-\infty)$ are
measurable sets.

Property 2 If $f$ is continuous on $\mathbb{R}^d$, then $f$ is measurable. If $f$ is
measurable and finite-valued, and $\Phi$ is continuous, then $\Phi \circ f$ is measurable.

In fact, $\Phi$ is continuous, so $\Phi^{-1}((-\infty, a))$ is an open set $\mathcal{O}$, and hence

$(\Phi \circ f)^{-1}((-\infty, a)) = f^{-1}(\mathcal{O})$ is measurable.

It should be noted, however, that in general it is not true that
$f \circ \Phi$ is measurable whenever $f$ is measurable and $\Phi$ is continuous. See
Exercise 35.

Property 3 Suppose $\{f_n\}_{n=1}^\infty$ is a sequence of measurable functions.
Then

$$\sup_n f_n(x), \quad \inf_n f_n(x), \quad \limsup_{n \to \infty} f_n(x) \quad \text{and} \quad \liminf_{n \to \infty} f_n(x)$$

are measurable.

Proving that $\sup_n f_n$ is measurable requires noting that
$\{\sup_n f_n > a\} = \bigcup_n \{f_n > a\}$. This also yields the result for
$\inf_n f_n(x)$, since this quantity equals $-\sup_n(-f_n(x))$.

The result for the limsup and liminf also follows from the two
observations

$$\limsup_{n \to \infty} f_n(x) = \inf_k \{\sup_{n \ge k} f_n\} \quad \text{and} \quad \liminf_{n \to \infty} f_n(x) = \sup_k \{\inf_{n \ge k} f_n\}.$$

Property 4 If $\{f_n\}_{n=1}^\infty$ is a collection of measurable functions, and

$$\lim_{n \to \infty} f_n(x) = f(x),$$

then $f$ is measurable.

Since $f(x) = \limsup_{n \to \infty} f_n(x) = \liminf_{n \to \infty} f_n(x)$, this property is a
consequence of Property 3.

Property 5 If $f$ and $g$ are measurable, then

(i) The integer powers $f^k$, $k \ge 1$ are measurable.

(ii) $f + g$ and $fg$ are measurable if both $f$ and $g$ are finite-valued.

For (i) we simply note that if $k$ is odd, then $\{f^k > a\} = \{f > a^{1/k}\}$, and
if $k$ is even and $a \ge 0$, then $\{f^k > a\} = \{f > a^{1/k}\} \cup \{f < -a^{1/k}\}$.

For (ii), we first see that $f + g$ is measurable because

$$\{f + g > a\} = \bigcup_{r \in \mathbb{Q}} \{f > a - r\} \cap \{g > r\},$$

with $\mathbb{Q}$ denoting the rationals.

Finally, $fg$ is measurable because of the previous results and the fact
that

$$fg = \frac{1}{4}\left[(f + g)^2 - (f - g)^2\right].$$

We shall say that two functions $f$ and $g$ defined on a set $E$ are equal
almost everywhere, and write

$$f(x) = g(x) \quad \text{a.e. } x \in E,$$

if the set $\{x \in E : f(x) \ne g(x)\}$ has measure zero. We sometimes
abbreviate this by saying that $f = g$ a.e. More generally, a property or
statement is said to hold almost everywhere (a.e.) if it is true except on
a set of measure zero.

One sees easily that if $f$ is measurable and $f = g$ a.e., then $g$ is
measurable. This follows at once from the fact that $\{f < a\}$ and $\{g < a\}$ differ
by a set of measure zero. Moreover, all the properties above can be
relaxed to conditions holding almost everywhere. For instance, if
$\{f_n\}_{n=1}^\infty$ is a collection of measurable functions, and

$$\lim_{n \to \infty} f_n(x) = f(x) \quad \text{a.e.},$$

then $f$ is measurable.

Note that if $f$ and $g$ are defined almost everywhere on a measurable
subset $E \subset \mathbb{R}^d$, then the functions $f + g$ and $fg$ can only be defined on
the intersection of the domains of $f$ and $g$. Since the union of two sets of
measure zero has again measure zero, $f + g$ is defined almost everywhere
on $E$. We summarize this discussion as follows.

Property 6 Suppose f is measurable, and f(x) = g(x) for a.e. x. Then 
g is measurable. 

In this light, Property 5 (ii) also holds when $f$ and $g$ are finite-valued
almost everywhere.

4.2 Approximation by simple functions or step functions 

The theorems in this section are all of the same nature and provide
further insight into the structure of measurable functions. We begin by
approximating pointwise, non-negative measurable functions by simple
functions.

Theorem 4.1 Suppose $f$ is a non-negative measurable function on $\mathbb{R}^d$.
Then there exists an increasing sequence of non-negative simple functions
$\{\varphi_k\}_{k=1}^\infty$ that converges pointwise to $f$, namely,

$$\varphi_k(x) \le \varphi_{k+1}(x) \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x), \quad \text{for all } x.$$

Proof. We begin first with a truncation. For $N \ge 1$, let $Q_N$ denote
the cube centered at the origin and of side length $N$. Then we define

$$F_N(x) = \begin{cases} f(x) & \text{if } x \in Q_N \text{ and } f(x) \le N, \\ N & \text{if } x \in Q_N \text{ and } f(x) > N, \\ 0 & \text{otherwise.} \end{cases}$$

Then, $F_N(x) \to f(x)$ as $N$ tends to infinity for all $x$. Now, we partition
the range of $F_N$, namely $[0, N]$, as follows. For fixed $N, M \ge 1$, we define

$$E_{\ell,M} = \left\{x \in Q_N : \frac{\ell}{M} < F_N(x) \le \frac{\ell + 1}{M}\right\}, \quad \text{for } 0 \le \ell < NM.$$

Then we may form

$$F_{N,M}(x) = \sum_{\ell} \frac{\ell}{M} \chi_{E_{\ell,M}}(x).$$

Each $F_{N,M}$ is a simple function that satisfies
$0 \le F_N(x) - F_{N,M}(x) \le 1/M$ for all $x$. If we now choose $N = M = 2^k$ with
$k \ge 1$ integral, and let $\varphi_k = F_{2^k,2^k}$, then we see that
$0 \le F_M(x) - \varphi_k(x) \le 1/2^k$ for all $x$,
$\{\varphi_k\}$ is increasing, and this sequence satisfies all the desired properties.
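
The construction in this proof can be evaluated pointwise. The sketch below
does so in dimension $d = 1$ with $N = M = 2^k$; the name `phi` and the test
function $f(x) = x^2$ are our own illustrative choices, and the code computes
only the value of $\varphi_k$ at a single point rather than the sets $E_{\ell,M}$.

```python
import math


def phi(f, x, k):
    """Value at the real number x of the k-th simple function from the
    proof of Theorem 4.1, taking N = M = 2**k and d = 1."""
    n = m = 2 ** k
    if abs(x) > n / 2:
        # x lies outside the cube Q_N = [-N/2, N/2], so F_N(x) = 0.
        return 0.0
    truncated = min(f(x), n)  # this is F_N(x)
    if truncated <= 0:
        return 0.0
    # The unique integer l with l/M < F_N(x) <= (l + 1)/M.
    level = math.ceil(truncated * m) - 1
    return level / m


def f(x):
    return x * x


values = [phi(f, 1.3, k) for k in range(1, 8)]
print(values)
```

At $x = 1.3$ the first value is $0$ because the point lies outside $Q_2$; from
$k = 2$ on, the values increase toward $f(1.3)$ and stay within $2^{-k}$ of it.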

Note that the result holds for non-negative functions that are
extended-valued, if the limit $+\infty$ is allowed. We now drop the assumption that $f$
is non-negative, and also allow the extended limit $-\infty$.

Theorem 4.2 Suppose $f$ is measurable on $\mathbb{R}^d$. Then there exists a
sequence of simple functions $\{\varphi_k\}_{k=1}^\infty$ that satisfies

$$|\varphi_k(x)| \le |\varphi_{k+1}(x)| \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x), \quad \text{for all } x.$$

In particular, we have $|\varphi_k(x)| \le |f(x)|$ for all $x$ and $k$.

Proof. We use the following decomposition of the function $f$:
$f(x) = f^+(x) - f^-(x)$, where

$$f^+(x) = \max(f(x), 0) \quad \text{and} \quad f^-(x) = \max(-f(x), 0).$$

Since both $f^+$ and $f^-$ are non-negative, the previous theorem yields
two increasing sequences of non-negative simple functions
$\{\varphi_k^{(1)}(x)\}_{k=1}^\infty$ and $\{\varphi_k^{(2)}(x)\}_{k=1}^\infty$ which converge
pointwise to $f^+$ and $f^-$, respectively.

Then, if we let

$$\varphi_k(x) = \varphi_k^{(1)}(x) - \varphi_k^{(2)}(x),$$

we see that $\varphi_k(x)$ converges to $f(x)$ for all $x$. Finally, the sequence
$\{|\varphi_k|\}$ is increasing because the definition of $f^+$, $f^-$ and the
properties of $\varphi_k^{(1)}$ and $\varphi_k^{(2)}$ imply that

$$|\varphi_k(x)| = \varphi_k^{(1)}(x) + \varphi_k^{(2)}(x).$$


We may now go one step further, and approximate by step functions.
Here, in general, the convergence may hold only almost everywhere.

Theorem 4.3 Suppose $f$ is measurable on $\mathbb{R}^d$. Then there exists a
sequence of step functions $\{\psi_k\}_{k=1}^\infty$ that converges pointwise to $f(x)$ for
almost every $x$.

Proof. By the previous result, it suffices to show that if $E$ is a
measurable set with finite measure, then $f = \chi_E$ can be approximated
by step functions. To this end, we recall part (iv) of Theorem 3.4,
which states that for every $\epsilon$ there exist cubes $Q_1, \ldots, Q_N$ such that
$m(E \triangle \bigcup_{j=1}^N Q_j) \le \epsilon$. By considering the grid formed by extending the
sides of these cubes, we see that there exist almost disjoint rectangles
$\tilde{R}_1, \ldots, \tilde{R}_M$ such that $\bigcup_{j=1}^N Q_j = \bigcup_{j=1}^M \tilde{R}_j$. By taking
rectangles $R_j$ contained in $\tilde{R}_j$, and slightly smaller in size, we find a
collection of disjoint rectangles that satisfy
$m(E \triangle \bigcup_{j=1}^M R_j) \le 2\epsilon$. Therefore

$$f(x) = \sum_{j=1}^M \chi_{R_j}(x),$$

except possibly on a set of measure $\le 2\epsilon$. Consequently, for every $k \ge 1$,
there exists a step function $\psi_k(x)$ such that if

$$E_k = \{x : f(x) \ne \psi_k(x)\},$$

then $m(E_k) \le 2^{-k}$. If we let $F_K = \bigcup_{j=K+1}^\infty E_j$ and
$F = \bigcap_{K=1}^\infty F_K$, then $m(F) = 0$ since $m(F_K) \le 2^{-K}$, and
$\psi_k(x) \to f(x)$ for all $x$ in the complement of $F$, which is the desired result.

4.3 Littlewood’s three principles

Although the notions of measurable sets and measurable functions
represent new tools, we should not overlook their relation to the older
concepts they replaced. Littlewood aptly summarized these connections in
the form of three principles that provide a useful intuitive guide in the
initial study of the theory.

(i) Every set is nearly a finite union of intervals. 

(ii) Every function is nearly continuous. 

(iii) Every convergent sequence is nearly uniformly convergent. 

The sets and functions referred to above are of course assumed to 
be measurable. The catch is in the word “nearly,” which has to be 
understood appropriately in each context. A precise version of the first 
principle appears in part (iv) of Theorem 3.4. An exact formulation of 
the third principle is given in the following important result. 

Theorem 4.4 (Egorov) Suppose $\{f_k\}_{k=1}^\infty$ is a sequence of measurable
functions defined on a measurable set $E$ with $m(E) < \infty$, and assume
that $f_k \to f$ a.e. on $E$. Given $\epsilon > 0$, we can find a closed set
$A_\epsilon \subset E$ such that $m(E - A_\epsilon) \le \epsilon$ and $f_k \to f$ uniformly on $A_\epsilon$.

Proof. We may assume without loss of generality that $f_k(x) \to f(x)$
for every $x \in E$. For each pair of positive integers $n$ and $k$, let

$$E_k^n = \{x \in E : |f_j(x) - f(x)| < 1/n, \text{ for all } j > k\}.$$

Now fix $n$ and note that $E_k^n \subset E_{k+1}^n$, and $E_k^n \nearrow E$ as $k$ tends to infinity.
By Corollary 3.3, we find that there exists $k_n$ such that
$m(E - E_{k_n}^n) < 1/2^n$. By construction, we then have

$$|f_j(x) - f(x)| < 1/n \quad \text{whenever } j > k_n \text{ and } x \in E_{k_n}^n.$$

We choose $N$ so that $\sum_{n=N}^\infty 2^{-n} < \epsilon/2$, and let

$$\tilde{A}_\epsilon = \bigcap_{n \ge N} E_{k_n}^n.$$

We first observe that

$$m(E - \tilde{A}_\epsilon) \le \sum_{n=N}^\infty m(E - E_{k_n}^n) < \epsilon/2.$$

Next, if $\delta > 0$, we choose $n \ge N$ such that $1/n < \delta$, and note that
$x \in \tilde{A}_\epsilon$ implies $x \in E_{k_n}^n$. We see therefore that
$|f_j(x) - f(x)| < \delta$ whenever $j > k_n$. Hence $f_k$ converges uniformly to $f$ on
$\tilde{A}_\epsilon$.

Finally, using Theorem 3.4 choose a closed subset $A_\epsilon \subset \tilde{A}_\epsilon$ with
$m(\tilde{A}_\epsilon - A_\epsilon) < \epsilon/2$. As a result, we have $m(E - A_\epsilon) < \epsilon$ and the
theorem is proved.
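
As an illustration that is not taken from the text, consider $f_k(x) = x^k$ on
$E = [0,1]$, which converges to $0$ for every $x < 1$. The convergence is not
uniform on $[0,1)$, but it is uniform on the closed set $[0, 1 - \epsilon]$, whose
complement in $E$ has measure $\epsilon$. The sketch below checks this numerically
on a sample grid; the helper name `sup_error` and the grid size are assumptions
made for the example.

```python
def sup_error(k, eps, samples=1000):
    """Largest sampled value of |x**k - 0| over the closed set [0, 1 - eps]."""
    right = 1.0 - eps
    return max((right * i / samples) ** k for i in range(samples + 1))


eps = 0.01
errors = [sup_error(k, eps) for k in (10, 100, 1000, 5000)]
print(errors)

# Near the right end of [0, 1) the error does not become small:
# at the moving point x = 1 - 1/k the value x**k stays close to 1/e.
k = 5000
moving_point_value = (1.0 - 1.0 / k) ** k
print(moving_point_value)
```

The sampled errors decrease to zero on $[0, 1 - \epsilon]$, whereas the value at
$x = 1 - 1/k$ remains bounded away from zero, which is the failure of uniform
convergence on all of $[0,1)$.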

The next theorem attests to the validity of the second of Littlewood’s
principles.

Theorem 4.5 (Lusin) Suppose $f$ is measurable and finite valued on $E$
with $E$ of finite measure. Then for every $\epsilon > 0$ there exists a closed set
$F_\epsilon$, with

$$F_\epsilon \subset E \quad \text{and} \quad m(E - F_\epsilon) \le \epsilon$$

and such that $f|_{F_\epsilon}$ is continuous.

By $f|_{F_\epsilon}$ we mean the restriction of $f$ to the set $F_\epsilon$. The conclusion of
the theorem states that if $f$ is viewed as a function defined only on $F_\epsilon$,
then $f$ is continuous. However, the theorem does not make the stronger
assertion that the function $f$ defined on $E$ is continuous at the points of
$F_\epsilon$.

Proof. Let $f_n$ be a sequence of step functions so that $f_n \to f$ a.e.
Then we may find sets $E_n$ so that $m(E_n) < 1/2^n$ and $f_n$ is continuous
outside $E_n$. By Egorov’s theorem, we may find a set $A_{\epsilon/3}$ on which
$f_n \to f$ uniformly and $m(E - A_{\epsilon/3}) \le \epsilon/3$. Then we consider

$$F' = A_{\epsilon/3} - \bigcup_{n \ge N} E_n$$

for $N$ so large that $\sum_{n \ge N} 1/2^n < \epsilon/3$. Now for every $n \ge N$ the function
$f_n$ is continuous on $F'$; thus $f$ (being the uniform limit of $\{f_n\}$) is also
continuous on $F'$. To finish the proof, we merely need to approximate
the set $F'$ by a closed set $F_\epsilon \subset F'$ such that $m(F' - F_\epsilon) < \epsilon/3$.

5* The Brunn-Minkowski inequality 

Since addition and multiplication by scalars are basic features of vector
spaces, it is not surprising that properties of these operations arise in a
fundamental way in the theory of Lebesgue measure on $\mathbb{R}^d$. We have
already discussed in this connection the translation-invariance and relative
dilation-invariance of Lebesgue measure. Here we come to the study of
the sum of two measurable sets $A$ and $B$, defined as

$$A + B = \{x \in \mathbb{R}^d : x = x' + x'' \text{ with } x' \in A \text{ and } x'' \in B\}.$$

This notion is of importance in a number of questions, in particular in
the theory of convex sets; we shall apply it to the isoperimetric problem
in Chapter 3.

In this regard the first (admittedly vague) question we can pose is
whether one can establish any general estimate for the measure of $A + B$
in terms of the measures of $A$ and $B$ (assuming that these three sets
are measurable). We can see easily that it is not possible to obtain an
upper bound for $m(A + B)$ in terms of $m(A)$ and $m(B)$. Indeed, simple
examples show that we may have $m(A) = m(B) = 0$ while $m(A + B) > 0$.
(See Exercise 20.)

In the converse direction one might ask for a general estimate of the
form

$$m(A + B)^\alpha \ge c_\alpha \left(m(A)^\alpha + m(B)^\alpha\right),$$

where $\alpha$ is a positive number and the constant $c_\alpha$ is independent of $A$
and $B$. Clearly, the best one can hope for is $c_\alpha = 1$. The role of the
exponent $\alpha$ can be understood by considering convex sets. Such sets
$A$ are defined by the property that whenever $x$ and $y$ are in $A$ then
the line segment joining them, $\{xt + y(1 - t) : 0 \le t \le 1\}$, also belongs
to $A$. If we recall the definition $\lambda A = \{\lambda x, x \in A\}$ for $\lambda > 0$, we note
that whenever $A$ is convex, then $A + \lambda A = (1 + \lambda)A$. However,
$m((1 + \lambda)A) = (1 + \lambda)^d m(A)$, and thus the presumed inequality can hold only
if $(1 + \lambda)^{d\alpha} \ge 1 + \lambda^{d\alpha}$, for all $\lambda > 0$. Now

$$(a + b)^\gamma \ge a^\gamma + b^\gamma \quad \text{if } \gamma \ge 1 \text{ and } a, b \ge 0, \tag{7}$$

while the reverse inequality holds if $0 \le \gamma \le 1$. (See Exercise 38.) This
yields $\alpha \ge 1/d$. Moreover, (7) shows that the inequality with the
exponent $1/d$ implies the corresponding inequality with $\alpha \ge 1/d$, and so we
are naturally led to the inequality

$$m(A + B)^{1/d} \ge m(A)^{1/d} + m(B)^{1/d}. \tag{8}$$
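
Both the elementary inequality (7) and the resulting restriction on the
exponent can be spot-checked numerically. The following sketch is only a
sanity check at a few sample values, not a proof; the function names and the
sample values are our own choices.

```python
def power_gap(a, b, gamma):
    """(a + b)**gamma - (a**gamma + b**gamma) for a, b >= 0."""
    return (a + b) ** gamma - (a ** gamma + b ** gamma)


def convex_gap(lam, d, alpha):
    """(1 + lam)**(d*alpha) - (1 + lam**(d*alpha)), the quantity that must be
    non-negative for the presumed inequality to hold on convex sets."""
    return (1 + lam) ** (d * alpha) - (1 + lam ** (d * alpha))


pairs = [(0.5, 2.0), (1.0, 1.0), (3.0, 0.25)]
gaps_gamma_2 = [power_gap(a, b, 2.0) for a, b in pairs]     # gamma >= 1
gaps_gamma_half = [power_gap(a, b, 0.5) for a, b in pairs]  # 0 <= gamma <= 1

# In dimension d = 3 an exponent below 1/3 already fails at lam = 1,
# while an exponent above 1/3 passes there.
too_small = convex_gap(1.0, 3, 0.2)
large_enough = convex_gap(1.0, 3, 0.5)
print(gaps_gamma_2, gaps_gamma_half, too_small, large_enough)
```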

Before proceeding with the proof of (8), we need to mention a technical
impediment that arises. While we may assume that $A$ and $B$ are
measurable, it does not necessarily follow that then $A + B$ is measurable.
(See Exercise 13 in the next chapter.) However it is easily seen that this
difficulty does not occur when, for example, $A$ and $B$ are closed sets, or
when one of them is open. (See Exercise 19.)

With the above considerations in mind we can state the main result.

Theorem 5.1 Suppose $A$ and $B$ are measurable sets in $\mathbb{R}^d$ and their
sum $A + B$ is also measurable. Then the inequality (8) holds.

Let us first check (8) when $A$ and $B$ are rectangles with side lengths
$\{a_j\}_{j=1}^d$ and $\{b_j\}_{j=1}^d$, respectively. Then (8) becomes

$$\left(\prod_{j=1}^d (a_j + b_j)\right)^{1/d} \ge \left(\prod_{j=1}^d a_j\right)^{1/d} + \left(\prod_{j=1}^d b_j\right)^{1/d}, \tag{9}$$

which by homogeneity we can reduce to the special case where
$a_j + b_j = 1$ for each $j$. In fact, notice that if we replace $a_j$, $b_j$ by
$\lambda_j a_j$, $\lambda_j b_j$, with $\lambda_j > 0$, then both sides of (9) are multiplied by
$(\lambda_1 \lambda_2 \cdots \lambda_d)^{1/d}$. We then need only choose $\lambda_j = (a_j + b_j)^{-1}$.
With this reduction, the inequality (9) is an immediate consequence of the
arithmetic-geometric inequality (Exercise 39)

$$\frac{1}{d} \sum_{j=1}^d x_j \ge \left(\prod_{j=1}^d x_j\right)^{1/d}, \quad \text{for all } x_j \ge 0:$$

we add the two inequalities that result when we set $x_j = a_j$ and $x_j = b_j$,
respectively.
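
The rectangle case (9) lends itself to a quick numerical experiment. The
sketch below draws random side lengths and records the smallest observed
value of the left-hand side minus the right-hand side; it also evaluates the
case $b_j = 2a_j$, where the two sides agree. The function name `bm_gap`, the
sampling ranges, and the seed are assumptions made for illustration, and the
experiment is of course no substitute for the proof.

```python
import math
import random


def bm_gap(a, b):
    """Left side minus right side of (9) for side lengths a and b."""
    d = len(a)
    lhs = math.prod(x + y for x, y in zip(a, b)) ** (1.0 / d)
    rhs = math.prod(a) ** (1.0 / d) + math.prod(b) ** (1.0 / d)
    return lhs - rhs


random.seed(0)
worst = float("inf")
for _ in range(1000):
    d = random.randint(1, 6)
    a = [random.uniform(0.1, 10.0) for _ in range(d)]
    b = [random.uniform(0.1, 10.0) for _ in range(d)]
    worst = min(worst, bm_gap(a, b))

# When B is a dilate of A, both sides of (9) coincide.
dilate_gap = bm_gap([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(worst, dilate_gap)
```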

We next turn to the case when each of $A$ and $B$ is the union of finitely
many rectangles whose interiors are disjoint. We shall prove (8) in this
case by induction on the total number of rectangles in $A$ and $B$. We
denote this number by $n$. Here it is important to note that the desired
inequality is unchanged when we translate $A$ and $B$ independently. In
fact, replacing $A$ by $A + h$ and $B$ by $B + h'$ replaces $A + B$ by
$A + B + h + h'$, and thus the corresponding measures remain the same. We now
choose a pair of disjoint rectangles $R_1$ and $R_2$ in the collection making up
$A$, and we note that they can be separated by a coordinate hyperplane.
Thus we may assume that for some $j$, after translation by an appropriate
$h$, $R_1$ lies in $A_- = A \cap \{x_j \le 0\}$, and $R_2$ in $A_+ = A \cap \{0 \le x_j\}$. Observe
also that both $A_+$ and $A_-$ contain at least one less rectangle than $A$ does,
and $A = A_- \cup A_+$.

We next translate $B$ so that $B_- = B \cap \{x_j \le 0\}$ and
$B_+ = B \cap \{x_j \ge 0\}$ satisfy

$$\frac{m(B_\pm)}{m(B)} = \frac{m(A_\pm)}{m(A)}.$$

However, $A + B \supset (A_+ + B_+) \cup (A_- + B_-)$, and the union on the right
is essentially disjoint, since the two parts lie in different half-spaces.
Moreover, the total number of rectangles in either $A_+$ and $B_+$, or $A_-$
and $B_-$ is also less than $n$. Thus the induction hypothesis applies and

$$\begin{aligned}
m(A + B) &\ge m(A_+ + B_+) + m(A_- + B_-) \\
&\ge \left(m(A_+)^{1/d} + m(B_+)^{1/d}\right)^d + \left(m(A_-)^{1/d} + m(B_-)^{1/d}\right)^d \\
&= m(A_+)\left[1 + \left(\frac{m(B)}{m(A)}\right)^{1/d}\right]^d + m(A_-)\left[1 + \left(\frac{m(B)}{m(A)}\right)^{1/d}\right]^d \\
&= \left(m(A)^{1/d} + m(B)^{1/d}\right)^d,
\end{aligned}$$

which gives the desired inequality (8) when $A$ and $B$ are both finite
unions of rectangles with disjoint interiors.

Next, this quickly implies the result when $A$ and $B$ are open sets of
finite measure. Indeed, by Theorem 1.4, for any $\epsilon > 0$ we can find unions
of almost disjoint rectangles $A_\epsilon$ and $B_\epsilon$, such that $A_\epsilon \subset A$,
$B_\epsilon \subset B$, with $m(A) \le m(A_\epsilon) + \epsilon$ and $m(B) \le m(B_\epsilon) + \epsilon$. Since
$A + B \supset A_\epsilon + B_\epsilon$, the inequality (8) for $A_\epsilon$ and $B_\epsilon$ and a passage
to a limit gives the desired result. From this, we can pass to the case where
$A$ and $B$ are arbitrary compact sets, by noting first that $A + B$ is then
compact, and that if we define $A^\epsilon = \{x : d(x, A) < \epsilon\}$, then $A^\epsilon$ are open,
and $A^\epsilon \searrow A$ as $\epsilon \to 0$. With similar definitions for $B^\epsilon$ and
$(A + B)^\epsilon$, we observe also that
$A + B \subset A^\epsilon + B^\epsilon \subset (A + B)^{2\epsilon}$. Hence, letting $\epsilon \to 0$, we see that (8) for
$A^\epsilon$ and $B^\epsilon$ implies the desired result for $A$ and $B$. The general case,
in which we assume that $A$, $B$, and $A + B$ are measurable, then follows
by approximating $A$ and $B$ from inside by compact sets, as in (iii) of
Theorem 3.4.


6 Exercises 

1. Prove that the Cantor set $\mathcal{C}$ constructed in the text is totally disconnected and
perfect. In other words, given two distinct points $x, y \in \mathcal{C}$, there is a point $z \notin \mathcal{C}$
that lies between $x$ and $y$, and yet $\mathcal{C}$ has no isolated points.

[Hint: If $x, y \in \mathcal{C}$ and $|x - y| > 1/3^k$, then $x$ and $y$ belong to two different intervals
in $C_k$. Also, given any $x \in \mathcal{C}$ there is an end-point $y_k$ of some interval in $C_k$ that
satisfies $x \ne y_k$ and $|x - y_k| \le 1/3^k$.]


2. The Cantor set $\mathcal{C}$ can also be described in terms of ternary expansions.

(a) Every number in $[0,1]$ has a ternary expansion

$$x = \sum_{k=1}^\infty a_k 3^{-k}, \quad \text{where } a_k = 0, 1, \text{ or } 2.$$

Note that this decomposition is not unique since, for example,
$1/3 = \sum_{k=2}^\infty 2/3^k$.

Prove that $x \in \mathcal{C}$ if and only if $x$ has a representation as above where every
$a_k$ is either 0 or 2.

(b) The Cantor-Lebesgue function is defined on $\mathcal{C}$ by

$$F(x) = \sum_{k=1}^\infty \frac{b_k}{2^k} \quad \text{if } x = \sum_{k=1}^\infty a_k 3^{-k}, \text{ where } b_k = a_k/2.$$

In this definition, we choose the expansion of $x$ in which $a_k = 0$ or 2.

Show that $F$ is well defined and continuous on $\mathcal{C}$, and moreover $F(0) = 0$ as
well as $F(1) = 1$.

(c) Prove that $F : \mathcal{C} \to [0,1]$ is surjective, that is, for every $y \in [0,1]$ there exists
$x \in \mathcal{C}$ such that $F(x) = y$.

(d) One can also extend $F$ to be a continuous function on $[0,1]$ as follows. Note
that if $(a, b)$ is an open interval of the complement of $\mathcal{C}$, then $F(a) = F(b)$.
Hence we may define $F$ to have the constant value $F(a)$ in that interval.

A geometrical construction of $F$ is described in Chapter 3.

3. Cantor sets of constant dissection. Consider the unit interval $[0,1]$, and
let $\xi$ be a fixed real number with $0 < \xi < 1$ (the case $\xi = 1/3$ corresponds to the
Cantor set $\mathcal{C}$ in the text).

In stage 1 of the construction, remove the centrally situated open interval in
$[0,1]$ of length $\xi$. In stage 2, remove two central intervals each of relative length
$\xi$, one in each of the remaining intervals after stage 1, and so on.

Let $\mathcal{C}_\xi$ denote the set which remains after applying the above procedure
indefinitely. 6

(a) Prove that the complement of $\mathcal{C}_\xi$ in $[0,1]$ is the union of open intervals of
total length equal to 1.

(b) Show directly that $m_*(\mathcal{C}_\xi) = 0$.

[Hint: After the $k$th stage, show that the remaining set has total length $= (1 - \xi)^k$.]

4. Cantor-like sets. Construct a closed set $\hat{\mathcal{C}}$ so that at the $k$th stage of the
construction one removes $2^{k-1}$ centrally situated open intervals each of length $\ell_k$,
with

$$\ell_1 + 2\ell_2 + \cdots + 2^{k-1}\ell_k < 1.$$


6 The set we call $\mathcal{C}_\xi$ is sometimes denoted by $\mathcal{C}_{\frac{1-\xi}{2}}$.

(a) If $\ell_j$ are chosen small enough, then $\sum_{k=1}^\infty 2^{k-1}\ell_k < 1$. In this case, show
that $m(\hat{\mathcal{C}}) > 0$, and in fact, $m(\hat{\mathcal{C}}) = 1 - \sum_{k=1}^\infty 2^{k-1}\ell_k$.

(b) Show that if $x \in \hat{\mathcal{C}}$, then there exists a sequence of points $\{x_n\}_{n=1}^\infty$ such
that $x_n \notin \hat{\mathcal{C}}$, yet $x_n \to x$ and $x_n \in I_n$, where $I_n$ is a sub-interval in the
complement of $\hat{\mathcal{C}}$ with $|I_n| \to 0$.

(c) Prove as a consequence that $\hat{\mathcal{C}}$ is perfect, and contains no open interval.

(d) Show also that $\hat{\mathcal{C}}$ is uncountable.

5. Suppose $E$ is a given set, and $\mathcal{O}_n$ is the open set:

$$\mathcal{O}_n = \{x : d(x, E) < 1/n\}.$$

Show:

(a) If $E$ is compact, then $m(E) = \lim_{n \to \infty} m(\mathcal{O}_n)$.

(b) However, the conclusion in (a) may be false for $E$ closed and unbounded; or
$E$ open and bounded.

6. Using translations and dilations, prove the following: Let $B$ be a ball in $\mathbb{R}^d$ of
radius $r$. Then $m(B) = v_d r^d$, where $v_d = m(B_1)$, and $B_1$ is the unit ball,
$B_1 = \{x \in \mathbb{R}^d : |x| < 1\}$.

A calculation of the constant $v_d$ is postponed until Exercise 14 in the next
chapter.

7. If $\delta = (\delta_1, \ldots, \delta_d)$ is a $d$-tuple of positive numbers $\delta_i > 0$, and $E$ is a subset of
$\mathbb{R}^d$, we define $\delta E$ by

$$\delta E = \{(\delta_1 x_1, \ldots, \delta_d x_d) : \text{where } (x_1, \ldots, x_d) \in E\}.$$

Prove that $\delta E$ is measurable whenever $E$ is measurable, and

$$m(\delta E) = \delta_1 \cdots \delta_d \, m(E).$$

8. Suppose $L$ is a linear transformation of $\mathbb{R}^d$. Show that if $E$ is a measurable
subset of $\mathbb{R}^d$, then so is $L(E)$, by proceeding as follows:

(a) Note that if $E$ is compact, so is $L(E)$. Hence if $E$ is an $F_\sigma$ set, so is $L(E)$.

(b) Because $L$ automatically satisfies the inequality

$$|L(x) - L(x')| \le M|x - x'|$$

for some $M$, we can see that $L$ maps any cube of side length $\ell$ into a
cube of side length $c_d M \ell$, with $c_d = 2\sqrt{d}$. Now if $m(E) = 0$, there is a
collection of cubes $\{Q_j\}$ such that $E \subset \bigcup_j Q_j$, and $\sum_j m(Q_j) < \epsilon$. Thus
$m_*(L(E)) \le c'\epsilon$, and hence $m(L(E)) = 0$. Finally, use Corollary 3.5.

One can show that $m(L(E)) = |\det L| \, m(E)$; see Problem 4 in the next chapter.


9. Give an example of an open set O with the following property: the boundary 
of the closure of O has positive Lebesgue measure. 

[Hint: Consider the set obtained by taking the union of open intervals which are 
deleted at the odd steps in the construction of a Cantor-like set.] 

10. This exercise provides a construction of a decreasing sequence of positive
continuous functions on the interval $[0,1]$, whose pointwise limit is not Riemann
integrable.

Let $\hat{\mathcal{C}}$ denote a Cantor-like set obtained from the construction detailed in
Exercise 4, so that in particular $m(\hat{\mathcal{C}}) > 0$. Let $F_1$ denote a piecewise-linear and
continuous function on $[0,1]$, with $F_1 = 1$ in the complement of the first interval removed
in the construction of $\hat{\mathcal{C}}$, $F_1 = 0$ at the center of this interval, and $0 \le F_1(x) \le 1$ for
all $x$. Similarly, construct $F_2 = 1$ in the complement of the intervals in stage two of
the construction of $\hat{\mathcal{C}}$, with $F_2 = 0$ at the center of these intervals, and $0 \le F_2 \le 1$.
Continuing this way, let $f_n = F_1 \cdot F_2 \cdots F_n$ (see Figure 5).





Figure 5. Construction of $\{F_n\}$ in Exercise 10



Prove the following: 

(a) For all $n \ge 1$ and all $x \in [0,1]$, one has $0 \le f_n(x) \le 1$ and
$f_n(x) \ge f_{n+1}(x)$. Therefore, $f_n(x)$ converges to a limit as
$n \to \infty$ which we denote by $f(x)$.

(b) The function $f$ is discontinuous at every point of $C$.

[Hint: Note that $f(x) = 1$ if $x \in C$, and find a sequence of points $\{x_n\}$
so that $x_n \to x$ and $f(x_n) = 0$.]

Now $\int f_n(x)\,dx$ is decreasing, hence $\int f_n$ converges. However, a bounded
function is Riemann integrable if and only if its set of discontinuities has
measure zero. (The proof of this fact, which is given in the Appendix of Book I,
is outlined in Problem 4.) Since $f$ is discontinuous on a set of positive
measure, we find that $f$ is not Riemann integrable.

11. Let A be the subset of [0,1] which consists of all numbers which do not have 
the digit 4 appearing in their decimal expansion. Find m(A). 

12. Theorem 1.3 states that every open set in $\mathbb{R}$ is the disjoint union of
open intervals. The analogue in $\mathbb{R}^d$, $d \ge 2$, is generally false.
Prove the following:

(a) An open disc in $\mathbb{R}^2$ is not the disjoint union of open rectangles.

[Hint: What happens to the boundary of any of these rectangles?]

(b) An open connected set $\Omega$ is the disjoint union of open rectangles if and
only if $\Omega$ is itself an open rectangle.


13. The following deals with $G_\delta$ and $F_\sigma$ sets.

(a) Show that a closed set is a $G_\delta$ and an open set an $F_\sigma$.

[Hint: If $F$ is closed, consider $O_n = \{x : d(x, F) < 1/n\}$.]

(b) Give an example of an $F_\sigma$ which is not a $G_\delta$.

[Hint: This is more difficult; let $F$ be a denumerable set that is dense.]

(c) Give an example of a Borel set which is not a $G_\delta$ nor an $F_\sigma$.


14. The purpose of this exercise is to show that covering by a finite number of
intervals will not suffice in the definition of the outer measure $m_*$.

The outer Jordan content $J_*(E)$ of a set $E$ in $\mathbb{R}$ is defined by

$$J_*(E) = \inf \sum_{j=1}^{N} |I_j|,$$

where the inf is taken over every finite covering $E \subset \bigcup_{j=1}^{N} I_j$
by intervals $I_j$.

(a) Prove that $J_*(E) = J_*(\overline{E})$ for every set $E$ (here $\overline{E}$
denotes the closure of $E$).

(b) Exhibit a countable subset $E \subset [0,1]$ such that $J_*(E) = 1$ while
$m_*(E) = 0$.


15. At the start of the theory, one might define the outer measure by taking
coverings by rectangles instead of cubes. More precisely, we define

$$m_*^{\mathcal{R}}(E) = \inf \sum_{j=1}^{\infty} |R_j|,$$

where the inf is now taken over all countable coverings
$E \subset \bigcup_{j=1}^{\infty} R_j$ by (closed) rectangles.

Show that this approach gives rise to the same theory of measure developed in
the text, by proving that $m_*(E) = m_*^{\mathcal{R}}(E)$ for every subset $E$ of
$\mathbb{R}^d$.

[Hint: Use Lemma 1.1.] 

16. The Borel-Cantelli lemma. Suppose $\{E_k\}_{k=1}^{\infty}$ is a countable
family of measurable subsets of $\mathbb{R}^d$ and that

$$\sum_{k=1}^{\infty} m(E_k) < \infty.$$

Let

$$E = \{x \in \mathbb{R}^d : x \in E_k \text{ for infinitely many } k\}
    = \limsup_{k \to \infty} (E_k).$$

(a) Show that $E$ is measurable.

(b) Prove $m(E) = 0$.

[Hint: Write $E = \bigcap_{n=1}^{\infty} \bigcup_{k \ge n} E_k$.]


17. Let $\{f_n\}$ be a sequence of measurable functions on $[0,1]$ with
$|f_n(x)| < \infty$ for a.e. $x$. Show that there exists a sequence $c_n$ of
positive real numbers such that

$$\frac{f_n(x)}{c_n} \to 0 \quad \text{a.e. } x.$$

[Hint: Pick $c_n$ such that $m(\{x : |f_n(x)/c_n| > 1/n\}) < 2^{-n}$, and apply the
Borel-Cantelli lemma.]

18. Prove the following assertion: Every measurable function is the limit a.e. of a 
sequence of continuous functions. 

19. Here are some observations regarding the set operation $A + B$.

(a) Show that if either $A$ or $B$ is open, then $A + B$ is open.

(b) Show that if $A$ and $B$ are closed, then $A + B$ is measurable.

(c) Show, however, that $A + B$ might not be closed even though $A$ and $B$ are
closed.

[Hint: For (b) show that $A + B$ is an $F_\sigma$ set.]


20. Show that there exist closed sets $A$ and $B$ with $m(A) = m(B) = 0$, but
$m(A + B) > 0$:

(a) In $\mathbb{R}$, let $A = \mathcal{C}$ (the Cantor set), $B = \mathcal{C}/2$.
Note that $A + B \supset [0,1]$.

(b) In $\mathbb{R}^2$, observe that if $A = I \times \{0\}$ and $B = \{0\} \times I$
(where $I = [0,1]$), then $A + B = I \times I$.


21. Prove that there is a continuous function that maps a Lebesgue measurable 
set to a non-measurable set. 

[Hint: Consider a non-measurable subset of [0,1], and its inverse image in C by the 
function F in Exercise 2.] 


22. Let $\chi_{[0,1]}$ be the characteristic function of $[0,1]$. Show that there
is no everywhere continuous function $f$ on $\mathbb{R}$ such that

$$f(x) = \chi_{[0,1]}(x) \quad \text{almost everywhere.}$$


23. Suppose $f(x, y)$ is a function on $\mathbb{R}^2$ that is separately
continuous: for each fixed variable, $f$ is continuous in the other variable.
Prove that $f$ is measurable on $\mathbb{R}^2$.

[Hint: Approximate $f$ in the variable $x$ by piecewise-linear functions $f_n$ so
that $f_n \to f$ pointwise.]

24. Does there exist an enumeration $\{r_n\}_{n=1}^{\infty}$ of the rationals, such
that the complement of

$$\bigcup_{n=1}^{\infty} \left( r_n - \frac{1}{n},\; r_n + \frac{1}{n} \right)$$

in $\mathbb{R}$ is non-empty?

[Hint: Find an enumeration where the only rationals outside of a fixed bounded
interval take the form $r_n$, with $n = m^2$ for some integer $m$.]

25. An alternative definition of measurability is as follows: $E$ is measurable if
for every $\epsilon > 0$ there is a closed set $F$ contained in $E$ with
$m_*(E - F) < \epsilon$. Show that this definition is equivalent to the one given
in the text.

26. Suppose $A \subset E \subset B$, where $A$ and $B$ are measurable sets of finite
measure. Prove that if $m(A) = m(B)$, then $E$ is measurable.

27. Suppose $E_1$ and $E_2$ are a pair of compact sets in $\mathbb{R}^d$ with
$E_1 \subset E_2$, and let $a = m(E_1)$ and $b = m(E_2)$. Prove that for any $c$
with $a < c < b$, there is a compact set $E$ with $E_1 \subset E \subset E_2$ and
$m(E) = c$.

[Hint: As an example, if $d = 1$ and $E$ is a measurable subset of $[0,1]$,
consider $m(E \cap [0,t])$ as a function of $t$.]

28. Let $E$ be a subset of $\mathbb{R}$ with $m_*(E) > 0$. Prove that for each
$0 < \alpha < 1$, there exists an open interval $I$ so that

$$m_*(E \cap I) \ge \alpha \, m_*(I).$$

Loosely speaking, this estimate shows that $E$ contains almost a whole interval.

[Hint: Choose an open set $O$ that contains $E$, and such that
$m_*(E) \ge \alpha \, m_*(O)$. Write $O$ as the countable union of disjoint open
intervals, and show that one of these intervals must satisfy the desired
property.]

29. Suppose $E$ is a measurable subset of $\mathbb{R}$ with $m(E) > 0$. Prove that
the difference set of $E$, which is defined by

$$\{z \in \mathbb{R} : z = x - y \text{ for some } x, y \in E\},$$

contains an open interval centered at the origin.

If $E$ contains an interval, then the conclusion is straightforward. In general,
one may rely on Exercise 28.

[Hint: Indeed, by Exercise 28, there exists an open interval $I$ so that
$m(E \cap I) \ge (9/10)\, m(I)$. If we denote $E \cap I$ by $E_0$, and suppose that
the difference set of $E_0$ does not contain an open interval around the origin,
then for arbitrarily small $a$ the sets $E_0$ and $E_0 + a$ are disjoint. From the
fact that $(E_0 \cup (E_0 + a)) \subset (I \cup (I + a))$ we get a contradiction,
since the left-hand side has measure $2m(E_0)$, while the right-hand side has
measure only slightly larger than $m(I)$.]

A more general formulation of this result is as follows. 

30. If $E$ and $F$ are measurable, and $m(E) > 0$, $m(F) > 0$, prove that

$$E + F = \{x + y : x \in E,\ y \in F\}$$

contains an interval.

31. The result in Exercise 29 provides an alternate proof of the non-measurability
of the set $\mathcal{N}$ studied in the text. In fact, we may also prove the
non-measurability of a set in $\mathbb{R}$ that is very closely related to the set
$\mathcal{N}$.

Given two real numbers $x$ and $y$, we shall write as before that $x \sim y$
whenever the difference $x - y$ is rational. Let $\mathcal{N}^*$ denote a set that
consists of one element in each equivalence class of $\sim$. Prove that
$\mathcal{N}^*$ is non-measurable by using the result in Exercise 29.

[Hint: If $\mathcal{N}^*$ is measurable, then so are its translates
$\mathcal{N}^*_n = \mathcal{N}^* + r_n$, where $\{r_n\}_{n=1}^{\infty}$ is an
enumeration of $\mathbb{Q}$. How does this imply that $m(\mathcal{N}^*) > 0$? Can
the difference set of $\mathcal{N}^*$ contain an open interval centered at the
origin?]

32. Let $\mathcal{N}$ denote the non-measurable subset of $I = [0,1]$ constructed
at the end of Section 3.

(a) Prove that if $E$ is a measurable subset of $\mathcal{N}$, then $m(E) = 0$.

(b) If $G$ is a subset of $\mathbb{R}$ with $m_*(G) > 0$, prove that a subset of
$G$ is non-measurable.

[Hint: For (a) use the translates of $E$ by the rationals.]

33. Let $\mathcal{N}$ denote the non-measurable set constructed in the text. Recall
from the exercise above that measurable subsets of $\mathcal{N}$ have measure zero.

Show that the set $\mathcal{N}^c = I - \mathcal{N}$ satisfies
$m_*(\mathcal{N}^c) = 1$, and conclude that if $E_1 = \mathcal{N}$ and
$E_2 = \mathcal{N}^c$, then

$$m_*(E_1) + m_*(E_2) \ne m_*(E_1 \cup E_2),$$

although $E_1$ and $E_2$ are disjoint.

[Hint: To prove that $m_*(\mathcal{N}^c) = 1$, argue by contradiction and pick a
measurable set $U$ such that $U \subset I$, $\mathcal{N}^c \subset U$ and
$m_*(U) < 1 - \epsilon$.]

34. Let $C_1$ and $C_2$ be any two Cantor sets (constructed in Exercise 3). Show
that there exists a function $F : [0,1] \to [0,1]$ with the following properties:

(i) $F$ is continuous and bijective,

(ii) $F$ is monotonically increasing,

(iii) $F$ maps $C_1$ surjectively onto $C_2$.

[Hint: Copy the construction of the standard Cantor-Lebesgue function.] 

35. Give an example of a measurable function $f$ and a continuous function $\Phi$
so that $f \circ \Phi$ is non-measurable.

[Hint: Let $\Phi : C_1 \to C_2$ as in Exercise 34, with $m(C_1) > 0$ and
$m(C_2) = 0$. Let $N \subset C_1$ be non-measurable, and take
$f = \chi_{\Phi(N)}$.]

Use the construction in the hint to show that there exists a Lebesgue measurable 
set that is not a Borel set. 

36. This exercise provides an example of a measurable function $f$ on $[0,1]$ such
that every function $g$ equivalent to $f$ (in the sense that $f$ and $g$ differ
only on a set of measure zero) is discontinuous at every point.

(a) Construct a measurable set $E \subset [0,1]$ such that for any non-empty open
sub-interval $I$ in $[0,1]$, both sets $E \cap I$ and $E^c \cap I$ have positive
measure.

(b) Show that $f = \chi_E$ has the property that whenever $g(x) = f(x)$ a.e. $x$,
then $g$ must be discontinuous at every point in $[0,1]$.

[Hint: For the first part, consider a Cantor-like set of positive measure, and add in 
each of the intervals that are omitted in the first step of its construction, another 
Cantor-like set. Continue this procedure indefinitely.] 

37. Suppose $\Gamma$ is a curve $y = f(x)$ in $\mathbb{R}^2$, where $f$ is
continuous. Show that

$$m(\Gamma) = 0.$$

[Hint: Cover $\Gamma$ by rectangles, using the uniform continuity of $f$.]

38. Prove that $(a + b)^\gamma \ge a^\gamma + b^\gamma$ whenever $\gamma \ge 1$ and
$a, b \ge 0$. Also, show that the reverse inequality holds when
$0 \le \gamma \le 1$.

[Hint: Integrate the inequality between $(a + t)^{\gamma - 1}$ and
$t^{\gamma - 1}$ from $0$ to $b$.]

39. Establish the inequality

$$\frac{x_1 + \cdots + x_d}{d} \ge (x_1 \cdots x_d)^{1/d}
  \quad \text{for all } x_j \ge 0,\ j = 1, \ldots, d \tag{10}$$

by using backward induction as follows:

(a) The inequality is true whenever $d$ is a power of 2 ($d = 2^k$, $k \ge 1$).

(b) If (10) holds for some integer $d \ge 2$, then it must hold for $d - 1$, that
is, one has
$(y_1 + \cdots + y_{d-1})/(d-1) \ge (y_1 \cdots y_{d-1})^{1/(d-1)}$ for all
$y_j \ge 0$, with $j = 1, \ldots, d-1$.

[Hint: For (a), if $k \ge 2$, write $(x_1 + \cdots + x_{2^k})/2^k$ as $(A + B)/2$,
where $A = (x_1 + \cdots + x_{2^{k-1}})/2^{k-1}$, and apply the inequality when
$d = 2$. For (b), apply the inequality to $x_1 = y_1, \ldots, x_{d-1} = y_{d-1}$
and $x_d = (y_1 + \cdots + y_{d-1})/(d-1)$.]


7 Problems 


1. Given an irrational $x$, one can show (using the pigeon-hole principle, for
example) that there exist infinitely many fractions $p/q$, with relatively prime
integers $p$ and $q$ such that

$$\left| x - \frac{p}{q} \right| \le \frac{1}{q^2}.$$

However, prove that the set of those $x \in \mathbb{R}$ such that there exist
infinitely many fractions $p/q$, with relatively prime integers $p$ and $q$ such
that

$$\left| x - \frac{p}{q} \right| \le \frac{1}{q^3}
  \quad (\text{or } \le 1/q^{2+\epsilon}),$$

is a set of measure zero.

[Hint: Use the Borel-Cantelli lemma.] 


2. Any open set $\Omega$ can be written as the union of closed cubes, so that
$\Omega = \bigcup Q_j$ with the following properties:

(i) The $Q_j$'s have disjoint interiors.

(ii) $d(Q_j, \Omega^c) \approx$ side length of $Q_j$. This means that there are
positive constants $c$ and $C$ so that
$c \le d(Q_j, \Omega^c)/\ell(Q_j) \le C$, where $\ell(Q_j)$ denotes the side
length of $Q_j$.

3. Find an example of a measurable subset $C$ of $[0,1]$ such that $m(C) = 0$, yet
the difference set of $C$ contains a non-trivial interval centered at the origin.
Compare with the result in Exercise 29.

[Hint: Pick $C$ to be the Cantor set $\mathcal{C}$. For a fixed $a \in [-1,1]$,
consider the line $y = x + a$ in the plane, and copy the construction of the
Cantor set, but in the cube $Q = [0,1] \times [0,1]$. First, remove all but four
closed cubes of side length $1/3$, one at each corner of $Q$; then, repeat this
procedure in each of the remaining cubes (see Figure 6). The resulting set is
sometimes called a Cantor dust. Use the property of nested compact sets to show
that the line intersects this Cantor dust.]




Figure 6. Construction of the Cantor dust 


4. Complete the following outline to prove that a bounded function on an interval 
[a, b] is Riemann integrable if and only if its set of discontinuities has measure zero. 
This argument is given in detail in the appendix to Book I. 

Let $f$ be a bounded function on a compact interval $J$, and let $I(c, r)$ denote
the open interval centered at $c$ of radius $r > 0$. Let
$\operatorname{osc}(f, c, r) = \sup |f(x) - f(y)|$, where the supremum is taken
over all $x, y \in J \cap I(c, r)$, and define the oscillation of $f$ at $c$ by
$\operatorname{osc}(f, c) = \lim_{r \to 0} \operatorname{osc}(f, c, r)$. Clearly,
$f$ is continuous at $c \in J$ if and only if $\operatorname{osc}(f, c) = 0$.

Prove the following assertions:

(a) For every $\epsilon > 0$, the set of points $c$ in $J$ such that
$\operatorname{osc}(f, c) \ge \epsilon$ is compact.

(b) If the set of discontinuities of $f$ has measure 0, then $f$ is Riemann
integrable.

[Hint: Given $\epsilon > 0$ let
$A_\epsilon = \{c \in J : \operatorname{osc}(f, c) \ge \epsilon\}$. Cover
$A_\epsilon$ by a finite number of open intervals whose total length is
$\le \epsilon$. Select an appropriate partition of $J$ and estimate the
difference between the upper and lower sums of $f$ over this partition.]

(c) Conversely, if $f$ is Riemann integrable on $J$, then its set of
discontinuities has measure 0.

[Hint: The set of discontinuities of $f$ is contained in $\bigcup_n A_{1/n}$.
Choose a partition $P$ such that $U(f, P) - L(f, P) < \epsilon/n$. Show that the
total length of the intervals in $P$ whose interior intersect $A_{1/n}$ is
$\le \epsilon$.]


5. Suppose $E$ is measurable with $m(E) < \infty$, and

$$E = E_1 \cup E_2, \qquad E_1 \cap E_2 = \emptyset.$$

If $m(E) = m_*(E_1) + m_*(E_2)$, then $E_1$ and $E_2$ are measurable.

In particular, if $E \subset Q$, where $Q$ is a finite cube, then $E$ is measurable
if and only if $m(Q) = m_*(E) + m_*(Q - E)$.

6.* The fact that the axiom of choice and the well-ordering principle are equivalent 
is a consequence of the following considerations. 

One begins by defining a partial ordering on a set $E$ to be a binary relation
$\le$ on the set $E$ that satisfies:

(i) $x \le x$ for all $x \in E$.

(ii) If $x \le y$ and $y \le x$, then $x = y$.

(iii) If $x \le y$ and $y \le z$, then $x \le z$.

If in addition $x \le y$ or $y \le x$ whenever $x, y \in E$, then $\le$ is a linear
ordering of $E$.

The axiom of choice and the well-ordering principle are then logically equivalent
to the Hausdorff maximal principle:

Every non-empty partially ordered set has a (non-empty) maximal
linearly ordered subset.

In other words, if $E$ is partially ordered by $\le$, then $E$ contains a non-empty
subset $F$ which is linearly ordered by $\le$ and such that if $F$ is contained in
a set $G$ also linearly ordered by $\le$, then $F = G$.

An application of the Hausdorff maximal principle to the collection of all well- 
orderings of subsets of E implies the well-ordering principle for E. However, the 
proof that the axiom of choice implies the Hausdorff maximal principle is more 
complicated. 

7.* Consider the curve $\Gamma = \{y = f(x)\}$ in $\mathbb{R}^2$, $0 \le x \le 1$.
Assume that $f$ is twice continuously differentiable in $0 \le x \le 1$. Then show
that $m(\Gamma + \Gamma) > 0$ if and only if $\Gamma + \Gamma$ contains an open
set, if and only if $f$ is not linear.

8.* Suppose $A$ and $B$ are open sets of finite positive measure. Then we have
equality in the Brunn-Minkowski inequality (8) if and only if $A$ and $B$ are
convex and similar, that is, there are a $\delta > 0$ and an
$h \in \mathbb{R}^d$ such that

$$A = \delta B + h.$$


Integration Theory 


...amongst the many definitions that have been successively
proposed for the integral of real-valued functions
of a real variable, I have retained only those which, in
my opinion, are indispensable to understand the
transformations undergone by the problem of integration,
and to capture the relationship between the notion of
area, so simple in appearance, and certain more
complicated analytical definitions of the integral.

One might ask if there is sufficient interest to occupy
oneself with such complications, and if it is not
better to restrict oneself to the study of functions that
necessitate only simple definitions.... As we shall see
in this course, we would then have to renounce the
possibility of resolving many problems posed long ago,
and which have simple statements. It is to solve these
problems, and not for love of complications, that I
have introduced in this book a definition of the
integral more general than that of Riemann.

H. Lebesgue, 1903 


1 The Lebesgue integral: basic properties and convergence theorems

The general notion of the Lebesgue integral on $\mathbb{R}^d$ will be defined in a
step-by-step fashion, proceeding successively to increasingly larger families
of functions. At each stage we shall see that the integral satisfies
elementary properties such as linearity and monotonicity, and we prove
appropriate convergence theorems that amount to interchanging the
integral with limits. At the end of the process we shall have achieved a
general theory of integration that will be decisive in the study of further
problems.

We proceed in four stages, by progressively integrating: 

1. Simple functions

2. Bounded functions supported on a set of finite measure

3. Non-negative functions

4. Integrable functions (the general case).

We emphasize from the outset that all functions are assumed to be measurable.
At the beginning we also consider only finite-valued functions
which take on real values. Later we shall also consider extended-valued
functions, and also complex-valued functions.

Stage one: simple functions 

Recall from the previous chapter that a simple function $\varphi$ is a finite sum

$$\varphi(x) = \sum_{k=1}^{N} a_k \chi_{E_k}(x), \tag{1}$$

where the $E_k$ are measurable sets of finite measure and the $a_k$ are
constants. A complication that arises from this definition is that a simple
function can be written in a multitude of ways as such finite linear
combinations; for example, $0 = \chi_E - \chi_E$ for any measurable set $E$ of
finite measure. Fortunately, there is an unambiguous choice for the
representation of a simple function, which is natural and useful in applications.

The canonical form of $\varphi$ is the unique decomposition as in (1), where
the numbers $a_k$ are distinct and non-zero, and the sets $E_k$ are disjoint.

Finding the canonical form of $\varphi$ is straightforward: since $\varphi$ can
take only finitely many distinct and non-zero values, say $c_1, \ldots, c_M$, we
may set $F_k = \{x : \varphi(x) = c_k\}$, and note that the sets $F_k$ are
disjoint. Therefore $\varphi = \sum_{k=1}^{M} c_k \chi_{F_k}$ is the desired
canonical form of $\varphi$.

If $\varphi$ is a simple function with canonical form
$\varphi(x) = \sum_{k=1}^{M} c_k \chi_{F_k}(x)$,
then we define the Lebesgue integral of $\varphi$ by

$$\int_{\mathbb{R}^d} \varphi(x)\,dx = \sum_{k=1}^{M} c_k \, m(F_k).$$

If $E$ is a measurable subset of $\mathbb{R}^d$ with finite measure, then
$\varphi(x)\chi_E(x)$ is also a simple function, and we define

$$\int_E \varphi(x)\,dx = \int \varphi(x)\chi_E(x)\,dx.$$

To emphasize the choice of the Lebesgue measure $m$ in the definition of
the integral, one sometimes writes

$$\int_{\mathbb{R}^d} \varphi(x)\,dm(x)$$

for the Lebesgue integral of $\varphi$. In fact, as a matter of convenience, we
shall often write $\int \varphi(x)\,dx$ or simply $\int \varphi$ for the integral
of $\varphi$ over $\mathbb{R}^d$.
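
As a small illustration of the definition, the following Python sketch computes
the canonical form and the integral of a simple function on $\mathbb{R}$. It
assumes, purely for the example, that every set $E_k$ is a bounded half-open
interval $[l_k, r_k)$, so that $m(E_k) = r_k - l_k$; the names `canonical_form`
and `integral` are hypothetical helpers, not notation from the text.

```python
from collections import defaultdict
from fractions import Fraction


def canonical_form(terms):
    """Canonical form of phi = sum of a_k * chi_{[l_k, r_k)}.

    terms: list of (a_k, (l_k, r_k)) with bounded half-open intervals.
    Returns a dict {c: m(F_c)} over the distinct non-zero values c of phi.
    """
    # The endpoints cut the line into pieces on which phi is constant.
    cuts = sorted({p for _, (l, r) in terms for p in (l, r)})
    measure = defaultdict(Fraction)
    for left, right in zip(cuts, cuts[1:]):
        # Value of phi on [left, right): add a_k for each interval covering it.
        value = sum(a for a, (l, r) in terms if l <= left and right <= r)
        if value != 0:
            measure[value] += right - left
    return dict(measure)


def integral(terms):
    """Lebesgue integral of phi: sum of c_k * m(F_k) over the canonical form."""
    return sum(c * m for c, m in canonical_form(terms).items())


# phi = 2 * chi_[0,2) + 3 * chi_[1,3): equals 2 on [0,1), 5 on [1,2), 3 on [2,3).
phi = [(Fraction(2), (Fraction(0), Fraction(2))),
       (Fraction(3), (Fraction(1), Fraction(3)))]
print(canonical_form(phi))
print(integral(phi))
```

Exact rational arithmetic is used so that equal values of $\varphi$ are grouped
reliably into the same set $F_k$.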

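Proposition 1.1 The integral of simple functions defined above satisfies
the following properties:

(i) Independence of the representation. If
$\varphi = \sum_{k=1}^{N} a_k \chi_{E_k}$ is any representation of $\varphi$, then

$$\int \varphi = \sum_{k=1}^{N} a_k \, m(E_k).$$

(ii) Linearity. If $\varphi$ and $\psi$ are simple, and $a, b \in \mathbb{R}$, then

$$\int (a\varphi + b\psi) = a \int \varphi + b \int \psi.$$

(iii) Additivity. If $E$ and $F$ are disjoint subsets of $\mathbb{R}^d$ with finite
measure, then

$$\int_{E \cup F} \varphi = \int_E \varphi + \int_F \varphi.$$

(iv) Monotonicity. If $\varphi \le \psi$ are simple, then

$$\int \varphi \le \int \psi.$$

(v) Triangle inequality. If $\varphi$ is a simple function, then so is
$|\varphi|$, and

$$\left| \int \varphi \right| \le \int |\varphi|.$$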

Proof. The only conclusion that is a little tricky is the first, which
asserts that the integral of a simple function can be calculated by using
any of its decompositions as a linear combination of characteristic
functions.

Suppose that $\varphi = \sum_{k=1}^{N} a_k \chi_{E_k}$, where we assume that the
sets $E_k$ are disjoint, but we do not suppose that the numbers $a_k$ are distinct
and non-zero. For each distinct non-zero value $a$ among the $\{a_k\}$ we define
$E'_a = \bigcup E_k$, where the union is taken over those indices $k$ such that
$a_k = a$. Note then that the sets $E'_a$ are disjoint, and
$m(E'_a) = \sum m(E_k)$, where the sum is taken over the same set of $k$'s. Then
clearly $\varphi = \sum a \chi_{E'_a}$, where the sum is over the distinct
non-zero values of $\{a_k\}$. Thus

$$\int \varphi = \sum a \, m(E'_a) = \sum_{k=1}^{N} a_k \, m(E_k).$$

Next, suppose $\varphi = \sum_{k=1}^{N} a_k \chi_{E_k}$, where we no longer assume
that the $E_k$ are disjoint. Then we can "refine" the decomposition
$\bigcup_{k=1}^{N} E_k$ by finding sets $E_1^*, \ldots, E_n^*$ with the property
that $\bigcup_{k=1}^{N} E_k = \bigcup_{j=1}^{n} E_j^*$; the sets $E_j^*$
($j = 1, \ldots, n$) are mutually disjoint; and for each $k$,
$E_k = \bigcup E_j^*$, where the union is taken over those $E_j^*$ that are
contained in $E_k$. (A proof of this elementary fact can be found in Exercise 1.)
For each $j$, let now $a_j^* = \sum a_k$, with the summation taken over all $k$
such that $E_k$ contains $E_j^*$. Then clearly
$\varphi = \sum_{j=1}^{n} a_j^* \chi_{E_j^*}$. However, this is a decomposition
already dealt with above because the $E_j^*$ are disjoint. Thus

$$\int \varphi = \sum_j a_j^* \, m(E_j^*)
  = \sum_j \sum_{E_k \supset E_j^*} a_k \, m(E_j^*)
  = \sum_k a_k \, m(E_k),$$

and conclusion (i) is established.
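
The refinement step of this proof can be checked on a concrete case. The sketch
below is only an illustration under the assumption that the sets $E_k$ are
bounded half-open intervals in $\mathbb{R}$; `refine` and `weighted_sum` are
hypothetical names. It builds disjoint pieces $E_j^*$ with coefficients $a_j^*$
and compares $\sum_j a_j^* m(E_j^*)$ with $\sum_k a_k m(E_k)$.

```python
from fractions import Fraction


def refine(terms):
    """Split overlapping intervals E_k = [l_k, r_k) into disjoint pieces E*_j.

    Each piece carries a*_j, the sum of a_k over all E_k that contain it.
    """
    cuts = sorted({p for _, (l, r) in terms for p in (l, r)})
    pieces = []
    for left, right in zip(cuts, cuts[1:]):
        covering = [a for a, (l, r) in terms if l <= left and right <= r]
        if covering:  # keep only pieces that lie inside the union of the E_k
            pieces.append((sum(covering), (left, right)))
    return pieces


def weighted_sum(terms):
    """Sum of a_k * m(E_k) for the representation exactly as given."""
    return sum(a * (r - l) for a, (l, r) in terms)


overlapping = [(Fraction(1), (Fraction(0), Fraction(2))),
               (Fraction(-1), (Fraction(1), Fraction(4))),
               (Fraction(5, 2), (Fraction(1), Fraction(3)))]
disjoint = refine(overlapping)

print(weighted_sum(overlapping), weighted_sum(disjoint))
```

Both sums agree, as conclusion (i) predicts, even though the coefficients and
sets of the two representations differ.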

Conclusion (ii) follows by using any representation of $\varphi$ and $\psi$ and
the obvious linearity of (i).

For the additivity over sets, one must note that if $E$ and $F$ are disjoint,
then

$$\chi_{E \cup F} = \chi_E + \chi_F,$$

and we may use the linearity of the integral to see that
$\int_{E \cup F} \varphi = \int_E \varphi + \int_F \varphi$.

If $\eta \ge 0$ is a simple function, then its canonical form is everywhere
non-negative, and therefore $\int \eta \ge 0$ by the definition of the integral.
Applying this argument to $\psi - \varphi$ gives the desired monotonicity property.

Finally, for the triangle inequality, it suffices to write $\varphi$ in its
canonical form $\varphi = \sum_{k=1}^{N} a_k \chi_{E_k}$ and observe that

$$|\varphi| = \sum_{k=1}^{N} |a_k| \chi_{E_k}(x).$$

Therefore, by the triangle inequality applied to the definition of the
integral, one sees that

$$\left| \int \varphi \right| = \left| \sum_{k=1}^{N} a_k \, m(E_k) \right|
  \le \sum_{k=1}^{N} |a_k| \, m(E_k) = \int |\varphi|.$$

Incidentally, it is worthwhile to point out the following easy fact: whenever
$f$ and $g$ are a pair of simple functions that agree almost everywhere,
then $\int f = \int g$. The identity of the integrals of two functions that agree
almost everywhere will continue to hold for the successive definitions of
the integral that follow.

Stage two: bounded functions supported on a set of finite 
measure 

The support of a measurable function $f$ is defined to be the set of all
points where $f$ does not vanish,

$$\operatorname{supp}(f) = \{x : f(x) \ne 0\}.$$

We shall also say that $f$ is supported on a set $E$, if $f(x) = 0$ whenever
$x \notin E$.

Since $f$ is measurable, so is the set $\operatorname{supp}(f)$. We shall next be
interested in those bounded measurable functions that have
$m(\operatorname{supp}(f)) < \infty$.

An important result in the previous chapter (Theorem 4.2) states the
following: if $f$ is a function bounded by $M$ and supported on a set $E$, then
there exists a sequence $\{\varphi_n\}$ of simple functions, with each
$\varphi_n$ bounded by $M$ and supported on $E$, and such that

$$\varphi_n(x) \to f(x) \quad \text{for all } x.$$

The key lemma that follows allows us to define the integral for the class
of bounded functions supported on sets of finite measure.

Lemma 1.2 Let $f$ be a bounded function supported on a set $E$ of finite
measure. If $\{\varphi_n\}_{n=1}^{\infty}$ is any sequence of simple functions
bounded by $M$, supported on $E$, and with $\varphi_n(x) \to f(x)$ for a.e. $x$,
then:

(i) The limit $\lim_{n \to \infty} \int \varphi_n$ exists.

(ii) If $f = 0$ a.e., then the limit $\lim_{n \to \infty} \int \varphi_n$
equals 0.

Proof. The assertions of the lemma would be nearly obvious if we
had that $\varphi_n$ converges to $f$ uniformly on $E$. Instead, we recall one of
Littlewood's principles, which states that the convergence of a sequence
of measurable functions is "nearly" uniform. The precise statement lying
behind this principle is Egorov's theorem, which we proved in Chapter 1,
and which we apply here.

Since the measure of $E$ is finite, given $\epsilon > 0$ Egorov's theorem
guarantees the existence of a (closed) measurable subset $A_\epsilon$ of $E$ such
that $m(E - A_\epsilon) \le \epsilon$, and $\varphi_n \to f$ uniformly on
$A_\epsilon$. Therefore, setting $I_n = \int \varphi_n$ we have that

$$\begin{aligned}
|I_n - I_m| &\le \int_E |\varphi_n(x) - \varphi_m(x)|\,dx \\
 &= \int_{A_\epsilon} |\varphi_n(x) - \varphi_m(x)|\,dx
    + \int_{E - A_\epsilon} |\varphi_n(x) - \varphi_m(x)|\,dx \\
 &\le \int_{A_\epsilon} |\varphi_n(x) - \varphi_m(x)|\,dx
    + 2M \, m(E - A_\epsilon) \\
 &\le \int_{A_\epsilon} |\varphi_n(x) - \varphi_m(x)|\,dx + 2M\epsilon.
\end{aligned}$$

By the uniform convergence, one has, for all $x \in A_\epsilon$ and all large $n$
and $m$, the estimate $|\varphi_n(x) - \varphi_m(x)| < \epsilon$, so we deduce that

$$|I_n - I_m| \le m(E)\epsilon + 2M\epsilon \quad \text{for all large } n
  \text{ and } m.$$

Since $\epsilon$ is arbitrary and $m(E) < \infty$, this proves that $\{I_n\}$ is a
Cauchy sequence and hence converges, as desired.

For the second part, we note that if $f = 0$, we may repeat the argument
above to find that $|I_n| \le m(E)\epsilon + M\epsilon$, which yields
$\lim_{n \to \infty} I_n = 0$, as was to be shown.

Using Lemma 1.2 we can now turn to the integration of bounded functions
that are supported on sets of finite measure. For such a function $f$
we define its Lebesgue integral by

$$\int f(x)\,dx = \lim_{n \to \infty} \int \varphi_n(x)\,dx,$$

where $\{\varphi_n\}$ is any sequence of simple functions satisfying:
$|\varphi_n| \le M$, each $\varphi_n$ is supported on the support of $f$, and
$\varphi_n(x) \to f(x)$ for a.e. $x$ as $n$ tends to infinity. By the previous
lemma, we know that this limit exists.

Next, we must show that $\int f$ is independent of the limiting sequence
$\{\varphi_n\}$ used, in order for the integral to be well-defined. Therefore,
suppose that $\{\psi_n\}$ is another sequence of simple functions that is
bounded by $M$, supported on $\operatorname{supp}(f)$, and such that
$\psi_n(x) \to f(x)$ for a.e. $x$ as $n$ tends to infinity. Then, if
$\eta_n = \varphi_n - \psi_n$, the sequence $\{\eta_n\}$ consists of simple
functions bounded by $2M$, supported on a set of finite measure, and such that
$\eta_n \to 0$ a.e. as $n$ tends to infinity. We may therefore conclude, by the
second part of the lemma, that $\int \eta_n \to 0$ as $n$ tends to infinity.
Consequently, the two limits

$$\lim_{n \to \infty} \int \varphi_n(x)\,dx \quad \text{and} \quad
  \lim_{n \to \infty} \int \psi_n(x)\,dx$$

(which exist by the lemma) are indeed equal.

If $E$ is a subset of $\mathbb{R}^d$ with finite measure, and $f$ is bounded with
$m(\operatorname{supp}(f)) < \infty$, then it is natural to define

$$\int_E f(x)\,dx = \int f(x)\chi_E(x)\,dx.$$

Clearly, if $f$ is itself simple, then $\int f$ as defined above coincides with
the integral of simple functions studied earlier. This extension of the
definition of integration also satisfies all the basic properties of the integral
of simple functions.


Proposition 1.3 Suppose $f$ and $g$ are bounded functions supported on
sets of finite measure. Then the following properties hold.

(i) Linearity. If $a, b \in \mathbb{R}$, then

$$\int (af + bg) = a \int f + b \int g.$$

(ii) Additivity. If $E$ and $F$ are disjoint subsets of $\mathbb{R}^d$, then

$$\int_{E \cup F} f = \int_E f + \int_F f.$$

(iii) Monotonicity. If $f \le g$, then

$$\int f \le \int g.$$

(iv) Triangle inequality. $|f|$ is also bounded, supported on a set of finite
measure, and

$$\left| \int f \right| \le \int |f|.$$

All these properties follow by using approximations by simple functions,
and the properties of the integral of simple functions given in
Proposition 1.1.

We are now in a position to prove the first important convergence 
theorem. 


Theorem 1.4 (Bounded convergence theorem) Suppose that $\{f_n\}$
is a sequence of measurable functions that are all bounded by $M$, are
supported on a set $E$ of finite measure, and $f_n(x) \to f(x)$ a.e. $x$ as
$n \to \infty$. Then $f$ is measurable, bounded, supported on $E$ for a.e. $x$, and

$$\int |f_n - f| \to 0 \quad \text{as } n \to \infty.$$

Consequently,

$$\int f_n \to \int f \quad \text{as } n \to \infty.$$

Proof. From the assumptions one sees at once that $f$ is bounded by $M$
almost everywhere and vanishes outside $E$ except possibly on a set of
measure zero. Clearly, the triangle inequality for the integral implies
that it suffices to prove that $\int |f_n - f| \to 0$ as $n$ tends to infinity.

The proof is a reprise of the argument in Lemma 1.2. Given $\epsilon > 0$, we
may find, by Egorov's theorem, a measurable subset $A_\epsilon$ of $E$ such that
$m(E - A_\epsilon) \le \epsilon$ and $f_n \to f$ uniformly on $A_\epsilon$. Then,
we know that for all sufficiently large $n$ we have
$|f_n(x) - f(x)| \le \epsilon$ for all $x \in A_\epsilon$. Putting these facts
together yields

$$\begin{aligned}
\int |f_n(x) - f(x)|\,dx
 &\le \int_{A_\epsilon} |f_n(x) - f(x)|\,dx
    + \int_{E - A_\epsilon} |f_n(x) - f(x)|\,dx \\
 &\le \epsilon \, m(E) + 2M \, m(E - A_\epsilon)
\end{aligned}$$

for all large $n$. Since $\epsilon$ is arbitrary, the proof of the theorem is
complete.


We note that the above convergence theorem is a statement about the
interchange of an integral and a limit, since its conclusion simply says

$$\lim_{n \to \infty} \int f_n = \int \lim_{n \to \infty} f_n.$$
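
A numerical illustration of this interchange, offered only as a sketch: the
functions $f_n(x) = x^n$ on $[0,1]$ (an example chosen here, not taken from the
text) are bounded by $1$, supported on a set of finite measure, and converge to
$0$ for every $x \ne 1$, so the theorem predicts $\int f_n \to 0$. The helper
`midpoint_integral` is a hypothetical name for a plain midpoint rule.

```python
def midpoint_integral(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def f_n(n):
    """The n-th function of the sequence, f_n(x) = x**n."""
    return lambda x: x ** n


exponents = (1, 10, 100, 1000)
integrals = [midpoint_integral(f_n(n), 0.0, 1.0) for n in exponents]

for n, value in zip(exponents, integrals):
    # The exact value is 1 / (n + 1), which tends to 0 as n grows.
    print(n, value, 1.0 / (n + 1))
```

The computed values decrease toward $0$, matching the integral of the almost
everywhere limit.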


A useful observation that we can make at this point is the following: if
$f \ge 0$ is bounded and supported on a set of finite measure $E$ and
$\int f = 0$, then $f = 0$ almost everywhere. Indeed, if for each integer
$k \ge 1$ we set $E_k = \{x \in E : f(x) \ge 1/k\}$, then the fact that
$k^{-1} \chi_{E_k}(x) \le f(x)$ implies

$$k^{-1} m(E_k) \le \int f,$$

by monotonicity of the integral. Thus $m(E_k) = 0$ for all $k$, and since
$\{x : f(x) > 0\} = \bigcup_{k=1}^{\infty} E_k$, we see that $f = 0$ almost
everywhere.


Return to Riemann integrable functions 

We shall now show that Riemann integrable functions are also Lebesgue
integrable. When we combine this with the bounded convergence theorem
we have just proved, we see that Lebesgue integration resolves the
second problem in the Introduction.

Theorem 1.5 Suppose $f$ is Riemann integrable on the closed interval
$[a, b]$. Then $f$ is measurable, and

$$\int_{[a,b]}^{\mathcal{R}} f(x)\,dx = \int_{[a,b]}^{\mathcal{L}} f(x)\,dx,$$

where the integral on the left-hand side is the standard Riemann integral,
and that on the right-hand side is the Lebesgue integral.

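Proof. By definition, a Riemann integrable function is bounded, say
$|f(x)| \le M$, so we need to prove that $f$ is measurable, and then establish
the equality of integrals.

Again, by definition of Riemann integrability (see also Section 1 of the
Appendix in Book I), we may construct two sequences of step functions
$\{\varphi_k\}$ and $\{\psi_k\}$ that satisfy the following properties:
$|\varphi_k(x)| \le M$ and $|\psi_k(x)| \le M$ for all $x \in [a, b]$ and
$k \ge 1$,

$$\varphi_1(x) \le \varphi_2(x) \le \cdots \le f \le \cdots \le \psi_2(x)
  \le \psi_1(x),$$

and

$$\lim_{k \to \infty} \int_{[a,b]}^{\mathcal{R}} \varphi_k(x)\,dx
  = \lim_{k \to \infty} \int_{[a,b]}^{\mathcal{R}} \psi_k(x)\,dx
  = \int_{[a,b]}^{\mathcal{R}} f(x)\,dx. \tag{2}$$

Several observations are in order. First, it follows immediately from their
definition that for step functions the Riemann and Lebesgue integrals
agree; therefore

$$\int_{[a,b]}^{\mathcal{R}} \varphi_k(x)\,dx
  = \int_{[a,b]}^{\mathcal{L}} \varphi_k(x)\,dx
  \quad \text{and} \quad
  \int_{[a,b]}^{\mathcal{R}} \psi_k(x)\,dx
  = \int_{[a,b]}^{\mathcal{L}} \psi_k(x)\,dx \tag{3}$$

for all $k \ge 1$. Next, if we let

$$\tilde{\varphi}(x) = \lim_{k \to \infty} \varphi_k(x) \quad \text{and} \quad
  \tilde{\psi}(x) = \lim_{k \to \infty} \psi_k(x),$$

we have $\tilde{\varphi} \le f \le \tilde{\psi}$. Moreover, both
$\tilde{\varphi}$ and $\tilde{\psi}$ are measurable (being the limit of step
functions), and the bounded convergence theorem yields

$$\lim_{k \to \infty} \int_{[a,b]}^{\mathcal{L}} \varphi_k(x)\,dx
  = \int_{[a,b]}^{\mathcal{L}} \tilde{\varphi}(x)\,dx$$

and

$$\lim_{k \to \infty} \int_{[a,b]}^{\mathcal{L}} \psi_k(x)\,dx
  = \int_{[a,b]}^{\mathcal{L}} \tilde{\psi}(x)\,dx.$$

This together with (2) and (3) yields

$$\int_{[a,b]}^{\mathcal{L}} (\tilde{\psi}(x) - \tilde{\varphi}(x))\,dx = 0,$$

and since $\psi_k - \varphi_k \ge 0$, we must have
$\tilde{\psi} - \tilde{\varphi} \ge 0$. By the observation following the proof
of the bounded convergence theorem, we conclude that
$\tilde{\psi} - \tilde{\varphi} = 0$ a.e., and therefore
$\tilde{\varphi} = \tilde{\psi} = f$ a.e., which proves that $f$ is measurable.
Finally, since $\varphi_k \to f$ almost everywhere, we have (by definition)

$$\int_{[a,b]}^{\mathcal{L}} f(x)\,dx
  = \lim_{k \to \infty} \int_{[a,b]}^{\mathcal{L}} \varphi_k(x)\,dx
  = \int_{[a,b]}^{\mathcal{R}} f(x)\,dx.$$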
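
The squeeze between lower and upper step functions used in this proof can be
seen on a simple case. The sketch below assumes, for illustration only, the
increasing function $f(x) = x^2$ on $[0,1]$ and dyadic partitions, so that the
infimum and supremum of $f$ on each subinterval sit at its endpoints;
`step_integrals` is a hypothetical helper name.

```python
from fractions import Fraction


def step_integrals(f, k):
    """Integrals of the lower and upper step functions of an increasing f.

    The partition of [0, 1] has 2**k equal intervals; since f is increasing,
    its infimum on each interval is at the left endpoint and its supremum at
    the right endpoint.
    """
    n = 2 ** k
    h = Fraction(1, n)
    lower = sum(f(i * h) for i in range(n)) * h
    upper = sum(f((i + 1) * h) for i in range(n)) * h
    return lower, upper


def square(x):
    return x * x


bounds = [step_integrals(square, k) for k in range(1, 8)]
for k, (lower, upper) in enumerate(bounds, start=1):
    # The gap upper - lower equals (f(1) - f(0)) / 2**k and shrinks to 0.
    print(k, float(lower), float(upper), upper - lower)
```

Because the partitions are nested, the lower integrals increase and the upper
integrals decrease, and both close in on the common value $1/3$.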




Stage three: non-negative functions 

We proceed with the integrals of functions that are measurable and
non-negative but not necessarily bounded. It will be important to allow
these functions to be extended-valued, that is, these functions may take
on the value $+\infty$ (on a measurable set). We recall in this connection the
convention that one defines the supremum of a set of positive numbers
to be $+\infty$ if the set is unbounded.

In the case of such a function $f$ we define its (extended) Lebesgue
integral by

$$\int f(x)\,dx = \sup_{g} \int g(x)\,dx,$$

where the supremum is taken over all measurable functions $g$ such that
$0 \le g \le f$, and where $g$ is bounded and supported on a set of finite
measure.

With the above definition of the integral, there are only two possible
cases; the supremum is either finite, or infinite. In the first case, when
$\int f(x)\,dx < \infty$, we shall say that $f$ is Lebesgue integrable or simply
integrable.

Clearly, if $E$ is any measurable subset of $\mathbb{R}^d$, and $f \ge 0$, then $f\chi_E$ is
also positive, and we define

$$\int_E f(x)\,dx = \int f(x)\chi_E(x)\,dx.$$

Simple examples of functions on $\mathbb{R}^d$ that are integrable (or non-integrable)
are given by

$$f_a(x) = \begin{cases} |x|^{-a} & \text{if } |x| \le 1, \\ 0 & \text{if } |x| > 1, \end{cases} \qquad F_a(x) = \frac{1}{1 + |x|^a}, \quad \text{all } x \in \mathbb{R}^d.$$

Then $f_a$ is integrable exactly when $a < d$, while $F_a$ is integrable exactly
when $a > d$. See the discussion following Corollary 1.10 and also
Exercise 10.
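As a sanity check of the claim about $f_a$, here is the computation in dimension $d = 1$. This is only an illustrative special case, not the general argument. For $0 < \delta < 1$ and $a \ne 1$, the truncated function is bounded and continuous on $\delta \le |x| \le 1$, so its integral can be computed as a Riemann integral:

$$\int_{\delta \le |x| \le 1} |x|^{-a}\,dx = 2\int_{\delta}^{1} x^{-a}\,dx = \frac{2\,(1 - \delta^{1-a})}{1 - a}.$$

As $\delta \to 0$ this increases to $2/(1-a)$ when $a < 1$ and to $+\infty$ when $a > 1$; for $a = 1$ the truncated integral is $2\log(1/\delta)$, which is also unbounded. This is consistent with $f_a$ being integrable exactly when $a < d = 1$.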

Proposition 1.6 The integral of non-negative measurable functions
enjoys the following properties:

(i) Linearity. If $f, g \ge 0$, and $a, b$ are positive real numbers, then

$$\int (af + bg) = a\int f + b\int g.$$

(ii) Additivity. If $E$ and $F$ are disjoint subsets of $\mathbb{R}^d$, and $f \ge 0$, then

$$\int_{E \cup F} f = \int_E f + \int_F f.$$

(iii) Monotonicity. If $0 \le f \le g$, then

$$\int f \le \int g.$$

(iv) If $g$ is integrable and $0 \le f \le g$, then $f$ is integrable.

(v) If $f$ is integrable, then $f(x) < \infty$ for almost every $x$.

(vi) If $\int f = 0$, then $f(x) = 0$ for almost every $x$.

Proof. Of the first four assertions, only (i) is not an immediate
consequence of the definitions, and to prove it we argue as follows. We
take $a = b = 1$ and note that if $\varphi \le f$ and $\psi \le g$, where both $\varphi$ and $\psi$ are
bounded and supported on sets of finite measure, then $\varphi + \psi \le f + g$,
and $\varphi + \psi$ is also bounded and supported on a set of finite measure.
Consequently

$$\int f + \int g \le \int (f + g).$$

To prove the reverse inequality, suppose $\eta$ is bounded and supported on a
set of finite measure, and $\eta \le f + g$. If we define $\eta_1(x) = \min(f(x), \eta(x))$
and $\eta_2 = \eta - \eta_1$, we note that

$$\eta_1 \le f \quad\text{and}\quad \eta_2 \le g.$$

Moreover both $\eta_1, \eta_2$ are bounded and supported on sets of finite
measure. Hence

$$\int \eta = \int (\eta_1 + \eta_2) = \int \eta_1 + \int \eta_2 \le \int f + \int g.$$

Taking the supremum over $\eta$ yields the required inequality.

To prove the conclusion (v) we argue as follows. Suppose $E_k = \{x :
f(x) \ge k\}$, and $E_\infty = \{x : f(x) = \infty\}$. Then

$$\int f \ge \int \chi_{E_k} f \ge k\,m(E_k),$$

hence $m(E_k) \to 0$ as $k \to \infty$. Since $E_k \searrow E_\infty$, Corollary 3.3 in the
previous chapter implies that $m(E_\infty) = 0$.

The proof of (vi) is the same as the observation following Theorem 1.4.


We now turn our attention to some important convergence theorems
for the class of non-negative measurable functions. To motivate the
results that follow, we ask the following question: Suppose $f_n \ge 0$ and
$f_n(x) \to f(x)$ for almost every $x$. Is it true that $\int f_n\,dx \to \int f\,dx$?
Unfortunately, the example that follows provides a negative answer to this,
and shows that we must change our formulation of the question to obtain
a positive convergence result.

Let

$$f_n(x) = \begin{cases} n & \text{if } 0 < x < 1/n, \\ 0 & \text{otherwise.} \end{cases}$$

Then $f_n(x) \to 0$ for all $x$, yet $\int f_n(x)\,dx = 1$ for all $n$. In this particular
example, the limit of the integrals is greater than the integral of the limit
function. This turns out to be the case in general, as we shall see now.
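The spike example can be checked numerically. The sketch below is illustrative only: the names `spike` and `midpoint_integral` are ours, and the grid is chosen (as an assumption) so that the edge of each spike falls on a cell boundary, which makes the midpoint rule essentially exact here.

```python
def spike(n, x):
    """The function f_n from the example: height n on (0, 1/n), zero elsewhere."""
    return n if 0 < x < 1 / n else 0


def midpoint_integral(g, a, b, cells):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / cells
    return sum(g(a + (i + 0.5) * h) for i in range(cells)) * h


indices = (1, 2, 5, 10)
# The integrals stay equal to 1 for every n ...
integrals = [
    midpoint_integral(lambda x, n=n: spike(n, x), 0.0, 1.0, 1000 * n)
    for n in indices
]
# ... while the values at any fixed point are eventually 0.
values_at_point = [spike(n, 0.3) for n in indices]
print(integrals, values_at_point)
```

At the fixed point $x = 0.3$ the values vanish as soon as $1/n \le 0.3$, while every integral remains $1$, so the integral of the limit ($0$) is strictly smaller than the limit of the integrals.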

Lemma 1.7 (Fatou) Suppose $\{f_n\}$ is a sequence of measurable
functions with $f_n \ge 0$. If $\lim_{n\to\infty} f_n(x) = f(x)$ for a.e. $x$, then

$$\int f \le \liminf_{n\to\infty} \int f_n.$$

Proof. Suppose $0 \le g \le f$, where $g$ is bounded and supported on a
set $E$ of finite measure. If we set $g_n(x) = \min(g(x), f_n(x))$, then $g_n$ is
measurable, supported on $E$, and $g_n(x) \to g(x)$ a.e., so by the bounded
convergence theorem

$$\int g_n \to \int g.$$

By construction, we also have $g_n \le f_n$, so that $\int g_n \le \int f_n$, and therefore

$$\int g \le \liminf_{n\to\infty} \int f_n.$$

Taking the supremum over all $g$ yields the desired inequality.

In particular, we do not exclude the cases $\int f = \infty$, or
$\liminf_{n\to\infty} \int f_n = \infty$.

We can now immediately deduce the following series of corollaries. 

Corollary 1.8 Suppose $f$ is a non-negative measurable function, and
$\{f_n\}$ a sequence of non-negative measurable functions with $f_n(x) \le f(x)$
and $f_n(x) \to f(x)$ for almost every $x$. Then

$$\lim_{n\to\infty} \int f_n = \int f.$$

Proof. Since $f_n(x) \le f(x)$ a.e. $x$, we necessarily have $\int f_n \le \int f$ for
all $n$; hence

$$\limsup_{n\to\infty} \int f_n \le \int f.$$

This inequality combined with Fatou's lemma proves the desired limit.

In particular, we can now obtain a basic convergence theorem for the 
class of non-negative measurable functions. Its statement requires the 
following notation. 

In analogy with the symbols $\nearrow$ and $\searrow$ used to describe increasing and
decreasing sequences of sets, we shall write

$$f_n \nearrow f$$

whenever $\{f_n\}_{n=1}^{\infty}$ is a sequence of measurable functions that satisfies

$$f_n(x) \le f_{n+1}(x) \ \text{a.e. } x, \text{ all } n \ge 1 \quad\text{and}\quad \lim_{n\to\infty} f_n(x) = f(x) \ \text{a.e. } x.$$

Similarly, we write $f_n \searrow f$ whenever

$$f_n(x) \ge f_{n+1}(x) \ \text{a.e. } x, \text{ all } n \ge 1 \quad\text{and}\quad \lim_{n\to\infty} f_n(x) = f(x) \ \text{a.e. } x.$$

Corollary 1.9 (Monotone convergence theorem) Suppose $\{f_n\}$ is
a sequence of non-negative measurable functions with $f_n \nearrow f$. Then

$$\lim_{n\to\infty} \int f_n = \int f.$$

The monotone convergence theorem has the following useful
consequence:

Corollary 1.10 Consider a series $\sum_{k=1}^{\infty} a_k(x)$, where $a_k(x) \ge 0$ is
measurable for every $k \ge 1$. Then

$$\int \sum_{k=1}^{\infty} a_k(x)\,dx = \sum_{k=1}^{\infty} \int a_k(x)\,dx.$$

If $\sum_{k=1}^{\infty} \int a_k(x)\,dx$ is finite, then the series $\sum_{k=1}^{\infty} a_k(x)$ converges for
a.e. $x$.

Proof. Let $f_n(x) = \sum_{k=1}^{n} a_k(x)$ and $f(x) = \sum_{k=1}^{\infty} a_k(x)$. The
functions $f_n$ are measurable, $f_n(x) \le f_{n+1}(x)$, and $f_n(x) \to f(x)$ as $n$ tends
to infinity. Since

$$\int f_n = \sum_{k=1}^{n} \int a_k(x)\,dx,$$

the monotone convergence theorem implies

$$\sum_{k=1}^{\infty} \int a_k(x)\,dx = \int \sum_{k=1}^{\infty} a_k(x)\,dx.$$

If $\sum_{k=1}^{\infty} \int a_k < \infty$, then the above implies that $\sum_{k=1}^{\infty} a_k(x)$ is integrable,
and by our earlier observation, we conclude that $\sum_{k=1}^{\infty} a_k(x)$ is finite
almost everywhere.

We give two nice illustrations of this last corollary. 

The first consists of another proof of the Borel-Cantelli lemma (see
Exercise 16, Chapter 1), which says that if $E_1, E_2, \ldots$ is a collection
of measurable subsets with $\sum m(E_k) < \infty$, then the set of points that
belong to infinitely many sets $E_k$ has measure zero. To prove this fact,
we let

$$a_k(x) = \chi_{E_k}(x),$$

and note that a point $x$ belongs to infinitely many sets $E_k$ if and only
if $\sum_{k=1}^{\infty} a_k(x) = \infty$. Our assumption on $\sum m(E_k)$ says precisely that
$\sum_{k=1}^{\infty} \int a_k(x)\,dx < \infty$, and the corollary implies that $\sum_{k=1}^{\infty} a_k(x)$ is finite
except possibly on a set of measure zero, and thus the Borel-Cantelli
lemma is proved.
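A concrete instance may help fix ideas. The sketch below uses the sets $E_k = [0, 1/k^2]$, which are our own choice for illustration (they do not appear in the text): their measures sum to a finite number, and only the single point $x = 0$ lies in infinitely many of them. The helper name `membership_count` is hypothetical.

```python
import math


def membership_count(x, kmax=100000):
    """Number of indices k <= kmax with x in E_k = [0, 1/k^2].
    This is the partial sum of the series of characteristic functions at x."""
    return sum(1 for k in range(1, kmax + 1) if 0.0 <= x <= 1.0 / (k * k))


# Partial sum of the measures m(E_k) = 1/k^2; it stays below pi^2 / 6.
total_measure = sum(1.0 / (k * k) for k in range(1, 100001))

print(total_measure, math.pi ** 2 / 6)
print(membership_count(0.015))           # finitely many sets contain x = 0.015
print(membership_count(0.0, kmax=500))   # x = 0 lies in every E_k
```

For $x > 0$ the count stabilizes once $1/k^2 < x$, so the exceptional set here is $\{0\}$, which has measure zero as the lemma predicts.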

The second illustration will be useful in our discussion of
approximations to the identity in Chapter 3. Consider the function

$$f(x) = \begin{cases} \dfrac{1}{|x|^{d+1}} & \text{if } x \ne 0, \\ 0 & \text{otherwise.} \end{cases}$$

We prove that $f$ is integrable outside any ball, $|x| \ge \epsilon$, and moreover

$$\int_{|x| \ge \epsilon} f(x)\,dx \le \frac{C}{\epsilon}, \quad \text{for some constant } C > 0.$$

Indeed, if we let $A_k = \{x \in \mathbb{R}^d : 2^k\epsilon < |x| \le 2^{k+1}\epsilon\}$, and define

$$g(x) = \sum_{k=0}^{\infty} a_k(x) \quad\text{where}\quad a_k(x) = \frac{1}{(2^k\epsilon)^{d+1}}\,\chi_{A_k}(x),$$

then we must have $f(x) \le g(x)$, and hence $\int f \le \int g$. Since the set $A_k$
is obtained from $A = \{1 < |x| \le 2\}$ by a dilation of factor $2^k\epsilon$, we have
by the relative dilation-invariance properties of the Lebesgue measure,
that $m(A_k) = (2^k\epsilon)^d\,m(A)$. Also by Corollary 1.10, we see that

$$\int g = \sum_{k=0}^{\infty} \frac{m(A_k)}{(2^k\epsilon)^{d+1}} = m(A) \sum_{k=0}^{\infty} \frac{(2^k\epsilon)^d}{(2^k\epsilon)^{d+1}} = \frac{C}{\epsilon},$$

where $C = 2m(A)$. Note that the same dilation-invariance property in
fact shows that

$$\int_{|x| \ge \epsilon} \frac{dx}{|x|^{d+1}} = \frac{1}{\epsilon} \int_{|x| \ge 1} \frac{dx}{|x|^{d+1}}.$$

See also the identity (7) below.
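In dimension $d = 1$ the estimate can be checked by hand; this is only a sanity check in a special case. Here $A = \{1 < |x| \le 2\}$ has $m(A) = 2$, so $C = 2m(A) = 4$, while a direct computation gives

$$\int_{|x| \ge \epsilon} \frac{dx}{|x|^{2}} = 2\int_{\epsilon}^{\infty} \frac{dx}{x^{2}} = \frac{2}{\epsilon} \le \frac{4}{\epsilon} = \frac{C}{\epsilon}.$$

The exact value also scales like $1/\epsilon$, in agreement with the dilation identity.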



Stage four: general case 

If $f$ is any real-valued measurable function on $\mathbb{R}^d$, we say that $f$ is
Lebesgue integrable (or just integrable) if the non-negative measurable
function $|f|$ is integrable in the sense of the previous section.

If $f$ is Lebesgue integrable, we give a meaning to its integral as follows.
First, we may define

$$f^+(x) = \max(f(x), 0) \quad\text{and}\quad f^-(x) = \max(-f(x), 0),$$

so that both $f^+$ and $f^-$ are non-negative and $f^+ - f^- = f$. Since $f^{\pm} \le
|f|$, both functions $f^+$ and $f^-$ are integrable whenever $f$ is, and we then
define the Lebesgue integral of $f$ by

$$\int f = \int f^+ - \int f^-.$$

In practice one encounters many decompositions $f = f_1 - f_2$, where
$f_1, f_2$ are both non-negative integrable functions, and one would expect
that regardless of the decomposition of $f$, we always have

$$\int f = \int f_1 - \int f_2.$$

In other words, the definition of the integral should be independent of the
decomposition $f = f_1 - f_2$. To see why this is so, suppose $f = g_1 - g_2$
is another decomposition where both $g_1$ and $g_2$ are non-negative and
integrable. Since $f_1 - f_2 = g_1 - g_2$ we have $f_1 + g_2 = g_1 + f_2$; but both
sides of this last identity consist of positive measurable functions, so the
linearity of the integral in this case yields

$$\int f_1 + \int g_2 = \int g_1 + \int f_2.$$

Since all integrals involved are finite, we find the desired result

$$\int f_1 - \int f_2 = \int g_1 - \int g_2.$$

In considering the above definitions it is useful to keep in mind the
following small observations. Both the integrability of $f$, and the value
of its integral are unchanged if we modify $f$ arbitrarily on a set of measure
zero. It is therefore useful to adopt the convention that in the context
of integration we allow our functions to be undefined on sets of measure
zero. Moreover, if $f$ is integrable, then by (v) of Proposition 1.6, it is
finite-valued almost everywhere. Thus, availing ourselves of the above
convention, we can always add two integrable functions $f$ and $g$, since
the ambiguity of $f + g$, due to the extended values of each, resides in a
set of measure zero. Moreover, we note that when speaking of a function
$f$, we are, in effect, also speaking about the collection of all functions
that equal $f$ almost everywhere.

Simple applications of the definition and the properties proved
previously yield all the elementary properties of the integral:

Proposition 1.11 The integral of Lebesgue integrable functions is
linear, additive, monotonic, and satisfies the triangle inequality.

We now gather two results which, although instructive in their own
right, are also needed in the proof of the next theorem.

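Proposition 1.12 Suppose $f$ is integrable on $\mathbb{R}^d$. Then for every $\epsilon > 0$:

(i) There exists a set of finite measure $B$ (a ball, for example) such
that

$$\int_{B^c} |f| < \epsilon.$$

(ii) There is a $\delta > 0$ such that

$$\int_E |f| < \epsilon$$

whenever $m(E) < \delta$.

The last condition is known as absolute continuity.

Proof. By replacing $f$ with $|f|$ we may assume without loss of
generality that $f \ge 0$.

For the first part, let $B_N$ denote the ball of radius $N$ centered at the
origin, and note that if $f_N(x) = f(x)\chi_{B_N}(x)$, then $f_N \ge 0$ is measurable,
$f_N(x) \le f_{N+1}(x)$, and $\lim_{N\to\infty} f_N(x) = f(x)$. By the monotone
convergence theorem, we must have

$$\lim_{N\to\infty} \int f_N = \int f.$$

In particular, for some large $N$,

$$0 \le \int f - \int f\chi_{B_N} < \epsilon,$$

and since $1 - \chi_{B_N} = \chi_{B_N^c}$, this implies $\int_{B_N^c} f < \epsilon$.

For the second part, assuming again that $f \ge 0$, we let $f_N(x) = f(x)\chi_{E_N}(x)$
where

$$E_N = \{x : f(x) \le N\}.$$

Once again, $f_N \ge 0$ is measurable, $f_N(x) \le f_{N+1}(x)$, and given $\epsilon > 0$
there exists (by the monotone convergence theorem) an integer $N > 0$
such that

$$\int (f - f_N) < \frac{\epsilon}{2}.$$

We now pick $\delta > 0$ so that $N\delta < \epsilon/2$. If $m(E) < \delta$, then

$$\int_E f = \int_E (f - f_N) + \int_E f_N \le \int (f - f_N) + \int_E f_N \le \frac{\epsilon}{2} + N m(E) \le \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$

This concludes the proof of the proposition.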


Intuitively, integrable functions should in some sense vanish at infinity
since their integrals are finite, and the first part of the proposition
attaches a precise meaning to this intuition. One should observe, however,
that integrability need not guarantee the more naive pointwise vanishing
as $|x|$ becomes large. See Exercise 6.

We are now ready to prove a cornerstone of the theory of Lebesgue 
integration, the dominated convergence theorem. It can be viewed as a 
culmination of our efforts, and is a general statement about the interplay 
between limits and integrals. 

Theorem 1.13 Suppose $\{f_n\}$ is a sequence of measurable functions such
that $f_n(x) \to f(x)$ a.e. $x$, as $n$ tends to infinity. If $|f_n(x)| \le g(x)$, where
$g$ is integrable, then

$$\int |f_n - f| \to 0 \quad \text{as } n \to \infty,$$

and consequently

$$\int f_n \to \int f \quad \text{as } n \to \infty.$$

Proof. For each $N \ge 0$ let $E_N = \{x : |x| \le N,\ g(x) \le N\}$. Given
$\epsilon > 0$, we may argue as in the first part of Proposition 1.12, to see
that there exists $N$ so that $\int_{E_N^c} g < \epsilon$. Then the functions $f_n\chi_{E_N}$ are
bounded (by $N$) and supported on a set of finite measure, so that by the
bounded convergence theorem, we have

$$\int_{E_N} |f_n - f| < \epsilon, \quad \text{for all large } n.$$

Hence, we obtain the estimate

$$\int |f_n - f| = \int_{E_N} |f_n - f| + \int_{E_N^c} |f_n - f| \le \int_{E_N} |f_n - f| + 2\int_{E_N^c} g \le \epsilon + 2\epsilon = 3\epsilon$$

for all large $n$. This proves the theorem.
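A small numerical illustration of the theorem, under assumptions of our own choosing: on $[0,1]$ take $f_n(x) = x^n$, which is dominated by the integrable function $g = 1$ and tends to $0$ for every $x < 1$. The helper `midpoint_integral` is a hypothetical name, and the midpoint rule is only an approximation to the integral.

```python
def midpoint_integral(g, a, b, cells=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / cells
    return sum(g(a + (i + 0.5) * h) for i in range(cells)) * h


exponents = (1, 10, 100, 1000)
# L^1 distance between f_n(x) = x**n and its a.e. limit 0 on [0, 1].
norms = [midpoint_integral(lambda x, n=n: x ** n, 0.0, 1.0) for n in exponents]
for n, value in zip(exponents, norms):
    print(n, value, 1.0 / (n + 1))  # exact value is 1 / (n + 1)
```

The computed distances track $1/(n+1)$ and decrease to $0$. In the spike example $f_n = n\chi_{(0,1/n)}$ considered earlier no integrable dominating function exists, and the integrals do not converge to the integral of the limit.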


Complex-valued functions 

If $f$ is a complex-valued function on $\mathbb{R}^d$, we may write it as

$$f(x) = u(x) + iv(x),$$

where $u$ and $v$ are real-valued functions called the real and imaginary
parts of $f$, respectively. The function $f$ is measurable if and only if both $u$
and $v$ are measurable. We then say that $f$ is Lebesgue integrable if the
function $|f(x)| = (u(x)^2 + v(x)^2)^{1/2}$ (which is non-negative) is Lebesgue
integrable in the sense defined previously.

It is clear that

$$|u(x)| \le |f(x)| \quad\text{and}\quad |v(x)| \le |f(x)|.$$

Also, if $a, b \ge 0$, one has $(a + b)^{1/2} \le a^{1/2} + b^{1/2}$, so that

$$|f(x)| \le |u(x)| + |v(x)|.$$

As a result of these simple inequalities, we deduce that a complex-valued
function is integrable if and only if both its real and imaginary parts are
integrable. Then, the Lebesgue integral of $f$ is defined by

$$\int f(x)\,dx = \int u(x)\,dx + i\int v(x)\,dx.$$

Finally, if $E$ is a measurable subset of $\mathbb{R}^d$, and $f$ is a complex-valued
measurable function on $E$, we say that $f$ is Lebesgue integrable on $E$ if
$f\chi_E$ is integrable on $\mathbb{R}^d$, and we define $\int_E f = \int f\chi_E$.

The collection of all complex-valued integrable functions on a
measurable subset $E \subset \mathbb{R}^d$ forms a vector space over $\mathbb{C}$. Indeed, if $f$ and $g$
are integrable, then so is $f + g$, since the triangle inequality gives
$|(f + g)(x)| \le |f(x)| + |g(x)|$, and monotonicity of the integral then yields

$$\int_E |f + g| \le \int_E |f| + \int_E |g| < \infty.$$

Also, it is clear that if $a \in \mathbb{C}$ and if $f$ is integrable, then so is $af$. Finally,
the integral continues to be linear over $\mathbb{C}$.


2 The space $L^1$ of integrable functions

The fact that the integrable functions form a vector space is an
important observation about the algebraic properties of such functions. A
fundamental analytic fact is that this vector space is complete in the
appropriate norm.

For any integrable function $f$ on $\mathbb{R}^d$ we define the norm of $f$ (see
footnote 2),

$$\|f\| = \|f\|_{L^1} = \|f\|_{L^1(\mathbb{R}^d)} = \int_{\mathbb{R}^d} |f(x)|\,dx.$$

The collection of all integrable functions with the above norm gives a
(somewhat imprecise) definition of the space $L^1(\mathbb{R}^d)$. We also note that
$\|f\| = 0$ if and only if $f = 0$ almost everywhere (see Proposition 1.6),
and this simple property of the norm reflects the practice we have
already adopted not to distinguish two functions that agree almost
everywhere. With this in mind, we take the precise definition of $L^1(\mathbb{R}^d)$ to be
the space of equivalence classes of integrable functions, where we define
two functions to be equivalent if they agree almost everywhere. Often,
however, it is convenient to retain the (imprecise) terminology that an
element $f \in L^1(\mathbb{R}^d)$ is an integrable function, even though it is only an
equivalence class of such functions. Note that by the above, the norm
$\|f\|$ of an element $f \in L^1(\mathbb{R}^d)$ is well-defined by the choice of any
integrable function in its equivalence class. Moreover, $L^1(\mathbb{R}^d)$ inherits the
property that it is a vector space. This and other straightforward facts
are summarized in the following proposition.

Proposition 2.1 Suppose $f$ and $g$ are two functions in $L^1(\mathbb{R}^d)$.

(i) $\|af\|_{L^1(\mathbb{R}^d)} = |a|\,\|f\|_{L^1(\mathbb{R}^d)}$ for all $a \in \mathbb{C}$.

(ii) $\|f + g\|_{L^1(\mathbb{R}^d)} \le \|f\|_{L^1(\mathbb{R}^d)} + \|g\|_{L^1(\mathbb{R}^d)}$.

(iii) $\|f\|_{L^1(\mathbb{R}^d)} = 0$ if and only if $f = 0$ a.e.

(iv) $d(f, g) = \|f - g\|_{L^1(\mathbb{R}^d)}$ defines a metric on $L^1(\mathbb{R}^d)$.

In (iv), we mean that $d$ satisfies the following conditions. First, $d(f, g) \ge
0$ for all integrable functions $f$ and $g$, and $d(f, g) = 0$ if and only if $f = g$
a.e. Also, $d(f, g) = d(g, f)$, and finally, $d$ satisfies the triangle inequality

$$d(f, g) \le d(f, h) + d(h, g), \quad \text{for all } f, g, h \in L^1(\mathbb{R}^d).$$

A space $V$ with a metric $d$ is said to be complete if for every Cauchy
sequence $\{x_k\}$ in $V$ (that is, $d(x_k, x_\ell) \to 0$ as $k, \ell \to \infty$) there exists
$x \in V$ such that $\lim_{k\to\infty} x_k = x$ in the sense that

$$d(x_k, x) \to 0, \quad \text{as } k \to \infty.$$

Our main goal of completing the space of Riemann integrable functions
will be attained once we have established the next important theorem.

2 In this chapter the only norm we consider is the $L^1$-norm, so we often write $\|f\|$ for
$\|f\|_{L^1}$. Later, we shall have occasion to consider other norms, and then we shall modify
our notation accordingly.

Theorem 2.2 (Riesz-Fischer) The vector space $L^1$ is complete in its
metric.


Proof. Suppose $\{f_n\}$ is a Cauchy sequence in the norm, so that
$\|f_n - f_m\| \to 0$ as $n, m \to \infty$. The plan of the proof is to extract a subsequence
of $\{f_n\}$ that converges to $f$, both pointwise almost everywhere and in
the norm.

Under ideal circumstances we would have that the sequence $\{f_n\}$
converges almost everywhere to a limit $f$, and we would then prove that the
sequence converges to $f$ also in the norm. Unfortunately, almost
everywhere convergence does not hold for general Cauchy sequences (see
Exercise 12). The main point, however, is that if the convergence in the norm
is rapid enough, then almost everywhere convergence is a consequence,
and this can be achieved by dealing with an appropriate subsequence of
the original sequence.

Indeed, consider a subsequence $\{f_{n_k}\}_{k=1}^{\infty}$ of $\{f_n\}$ with the following
property:

$$\|f_{n_{k+1}} - f_{n_k}\| \le 2^{-k}, \quad \text{for all } k \ge 1.$$

The existence of such a subsequence is guaranteed by the fact that
$\|f_n - f_m\| \le \epsilon$ whenever $n, m \ge N(\epsilon)$, so that it suffices to take $n_k = N(2^{-k})$.

We now consider the series whose convergence will be seen below,

$$f(x) = f_{n_1}(x) + \sum_{k=1}^{\infty} \bigl(f_{n_{k+1}}(x) - f_{n_k}(x)\bigr)$$

and

$$g(x) = |f_{n_1}(x)| + \sum_{k=1}^{\infty} |f_{n_{k+1}}(x) - f_{n_k}(x)|,$$

and note that

$$\int |f_{n_1}| + \sum_{k=1}^{\infty} \int |f_{n_{k+1}} - f_{n_k}| \le \int |f_{n_1}| + \sum_{k=1}^{\infty} 2^{-k} < \infty.$$

So the monotone convergence theorem implies that $g$ is integrable, and
since $|f| \le g$, so is $f$. In particular, the series defining $f$ converges
almost everywhere, and since the partial sums of this series are precisely
the $f_{n_k}$ (by construction of the telescopic series), we find that

$$f_{n_k}(x) \to f(x) \quad \text{a.e. } x.$$

To prove that $f_{n_k} \to f$ in $L^1$ as well, we simply observe that $|f - f_{n_k}| \le g$
for all $k$, and apply the dominated convergence theorem to get
$\|f_{n_k} - f\|_{L^1} \to 0$ as $k$ tends to infinity.

Finally, the last step of the proof consists in recalling that $\{f_n\}$ is
Cauchy. Given $\epsilon$, there exists $N$ such that for all $n, m > N$ we have
$\|f_n - f_m\| < \epsilon/2$. If $n_k$ is chosen so that $n_k > N$, and $\|f_{n_k} - f\| < \epsilon/2$,
then the triangle inequality implies

$$\|f_n - f\| \le \|f_n - f_{n_k}\| + \|f_{n_k} - f\| < \epsilon$$

whenever $n > N$. Thus $\{f_n\}$ has the limit $f$ in $L^1$, and the proof of the
theorem is complete.

Since every sequence that converges in the norm is a Cauchy sequence
in that norm, the argument in the proof of the theorem yields the
following.

Corollary 2.3 If $\{f_n\}_{n=1}^{\infty}$ converges to $f$ in $L^1$, then there exists a
subsequence $\{f_{n_k}\}_{k=1}^{\infty}$ such that

$$f_{n_k}(x) \to f(x) \quad \text{a.e. } x.$$
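Passing to a subsequence is genuinely necessary. A standard illustration, which is our choice of example and not taken from the text, is the "typewriter" sequence of indicators of dyadic intervals on $[0,1)$: it tends to $0$ in $L^1$, yet at every point it takes the value $1$ infinitely often. The function names below are hypothetical.

```python
def typewriter(n, x):
    """Indicator of [j / 2**k, (j + 1) / 2**k), where n = 2**k + j with
    0 <= j < 2**k and n >= 1."""
    k = n.bit_length() - 1
    j = n - (1 << k)
    return 1.0 if j / 2 ** k <= x < (j + 1) / 2 ** k else 0.0


def l1_norm(n):
    """L^1 norm of the n-th function: the length of its dyadic interval."""
    k = n.bit_length() - 1
    return 2.0 ** (-k)


x = 0.3
# Along the full sequence, x is hit exactly once at every dyadic level.
hits = [n for n in range(1, 2 ** 10) if typewriter(n, x) == 1.0]
# Along the subsequence n_k = 2**k the values at x are eventually 0.
subsequence_values = [typewriter(2 ** k, x) for k in range(10)]

print(len(hits), subsequence_values, l1_norm(2 ** 9))
```

The subsequence $n_k = 2^k$ consists of the indicators of $[0, 2^{-k})$, which converge to $0$ at every $x > 0$, matching the conclusion of the corollary.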

We say that a family $\mathcal{G}$ of integrable functions is dense in $L^1$ if for any
$f \in L^1$ and $\epsilon > 0$, there exists $g \in \mathcal{G}$ so that $\|f - g\|_{L^1} < \epsilon$. Fortunately
we are familiar with many families that are dense in $L^1$, and we describe
some in the theorem that follows. These are useful when one is faced
with the problem of proving some fact or identity involving integrable
functions. In this situation a general principle applies: the result is often
easier to prove for a more restrictive class of functions (like the ones in
the theorem below), and then a density (or limiting) argument yields the
result in general.

Theorem 2.4 The following families of functions are dense in $L^1(\mathbb{R}^d)$:

(i) The simple functions.

(ii) The step functions.

(iii) The continuous functions of compact support.

Proof. Let $f$ be an integrable function on $\mathbb{R}^d$. First, we may assume
that $f$ is real-valued, because we may approximate its real and imaginary
parts independently. If this is the case, we may then write $f = f^+ - f^-$,
where $f^+, f^- \ge 0$, and it now suffices to prove the theorem when $f \ge 0$.

For (i), Theorem 4.1 in Chapter 1 guarantees the existence of a
sequence $\{\varphi_k\}$ of non-negative simple functions that increase to $f$
pointwise. By the dominated convergence theorem (or even simply the
monotone convergence theorem) we then have

$$\|f - \varphi_k\|_{L^1} \to 0 \quad \text{as } k \to \infty.$$

Thus there are simple functions that are arbitrarily close to $f$ in the $L^1$
norm.

For (ii), we first note that by (i) it suffices to approximate simple
functions by step functions. Then, we recall that a simple function is
a finite linear combination of characteristic functions of sets of finite
measure, so it suffices to show that if $E$ is such a set, then there is a
step function $\psi$ so that $\|\chi_E - \psi\|_{L^1}$ is small. However, we now recall
that this argument was already carried out in the proof of Theorem 4.3,
Chapter 1. Indeed, there it is shown that there is an almost disjoint
family of rectangles $\{R_j\}$ with $m(E \,\triangle\, \bigcup_{j=1}^{M} R_j) \le 2\epsilon$. Thus $\chi_E$ and
$\psi = \sum_j \chi_{R_j}$ differ at most on a set of measure $2\epsilon$, and as a result we find
that $\|\chi_E - \psi\|_{L^1} < 2\epsilon$.

By (ii), it suffices to establish (iii) when $f$ is the characteristic function
of a rectangle. In the one-dimensional case, where $f$ is the characteristic
function of an interval $[a, b]$, we may choose a continuous piecewise linear
function $g$ defined by

$$g(x) = \begin{cases} 1 & \text{if } a \le x \le b, \\ 0 & \text{if } x \le a - \epsilon \text{ or } x \ge b + \epsilon, \end{cases}$$

and with $g$ linear on the intervals $[a - \epsilon, a]$ and $[b, b + \epsilon]$. Then
$\|f - g\|_{L^1} < 2\epsilon$. In $d$ dimensions, it suffices to note that the characteristic
function of a rectangle is the product of characteristic functions of
intervals. Then, the desired continuous function of compact support is simply
the product of functions like $g$ defined above.

The results above for $L^1(\mathbb{R}^d)$ lead immediately to an extension in which
$\mathbb{R}^d$ can be replaced by any fixed subset $E$ of positive measure. In fact
if $E$ is such a subset, we can define $L^1(E)$ and carry out the arguments
that are analogous to $L^1(\mathbb{R}^d)$. Better yet, we can proceed by extending
any function $f$ on $E$ by setting $\tilde{f} = f$ on $E$ and $\tilde{f} = 0$ on $E^c$, and defining
$\|f\|_{L^1(E)} = \|\tilde{f}\|_{L^1(\mathbb{R}^d)}$. The analogues of Proposition 2.1 and Theorem 2.2
then hold for the space $L^1(E)$.

Invariance Properties

If $f$ is a function defined on $\mathbb{R}^d$, the translation of $f$ by a vector $h \in \mathbb{R}^d$
is the function $f_h$, defined by $f_h(x) = f(x - h)$. Here we want to examine
some basic aspects of translations of integrable functions.

First, there is the translation-invariance of the integral. One way to
state this is as follows: if $f$ is an integrable function, then so is $f_h$ and

$$\int_{\mathbb{R}^d} f(x - h)\,dx = \int_{\mathbb{R}^d} f(x)\,dx. \tag{4}$$

We check this assertion first when $f = \chi_E$, the characteristic function
of a measurable set $E$. Then obviously $f_h = \chi_{E_h}$, where $E_h = \{x + h :
x \in E\}$, and thus the assertion follows because $m(E_h) = m(E)$ (see
Section 3 in Chapter 1). As a result of linearity, the identity (4) holds for
all simple functions. Now if $f$ is non-negative and $\{\varphi_n\}$ is a sequence of
simple functions that increase pointwise a.e. to $f$ (such a sequence exists
by Theorem 4.1 in the previous chapter), then $\{(\varphi_n)_h\}$ is a sequence of
simple functions that increase to $f_h$ pointwise a.e., and the monotone
convergence theorem implies (4) in this special case. Thus, if $f$ is
complex-valued and integrable we see that $\int_{\mathbb{R}^d} |f(x - h)|\,dx = \int_{\mathbb{R}^d} |f(x)|\,dx$, which
shows that $f_h \in L^1(\mathbb{R}^d)$ and also $\|f_h\| = \|f\|$. From the definitions, we
then conclude that (4) holds whenever $f \in L^1$.

Incidentally, using the relative invariance of Lebesgue measure under
dilations and reflections (Section 3, Chapter 1) one can prove in the same
way that if $f(x)$ is integrable, so is $f(\delta x)$, $\delta > 0$, and $f(-x)$, and

$$\delta^d \int_{\mathbb{R}^d} f(\delta x)\,dx = \int_{\mathbb{R}^d} f(x)\,dx, \quad\text{while}\quad \int_{\mathbb{R}^d} f(-x)\,dx = \int_{\mathbb{R}^d} f(x)\,dx. \tag{5}$$

We digress to record for later use two useful consequences of the above
invariance properties:

(i) Suppose that $f$ and $g$ are a pair of measurable functions on $\mathbb{R}^d$ so
that for some fixed $x \in \mathbb{R}^d$ the function $y \mapsto f(x - y)g(y)$ is integrable.
As a consequence, the function $y \mapsto f(y)g(x - y)$ is then also integrable
and we have

$$\int_{\mathbb{R}^d} f(x - y)g(y)\,dy = \int_{\mathbb{R}^d} f(y)g(x - y)\,dy. \tag{6}$$

This follows from (4) and (5) on making the change of variables which
replaces $y$ by $x - y$, and noting that this change is a combination of a
translation and a reflection.

The integral on the left-hand side is denoted by $(f * g)(x)$ and is
defined as the convolution of $f$ and $g$. Thus (6) asserts the commutativity
of the convolution product.
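The commutativity can be observed numerically at a single point. The sketch below makes several assumptions of our own: the two functions are the indicator of $[0,1]$ and $e^{-|t|}$, the integral over the line is truncated to $[-20, 20]$, and it is approximated by a midpoint rule. The names `box`, `bump` and `convolve_at` are hypothetical.

```python
import math


def box(t):
    """Indicator of [0, 1]."""
    return 1.0 if 0.0 <= t <= 1.0 else 0.0


def bump(t):
    """The integrable function exp(-|t|)."""
    return math.exp(-abs(t))


def convolve_at(f, g, x, lo=-20.0, hi=20.0, cells=200000):
    """Midpoint-rule approximation of the integral of f(x - y) g(y) dy,
    truncated to the interval [lo, hi]."""
    h = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        y = lo + (i + 0.5) * h
        total += f(x - y) * g(y)
    return total * h


fg = convolve_at(box, bump, 0.5)   # approximates (box * bump)(0.5)
gf = convolve_at(bump, box, 0.5)   # approximates (bump * box)(0.5)
exact = 2.0 * (1.0 - math.exp(-0.5))  # integral of exp(-|y|) over [-0.5, 0.5]
print(fg, gf, exact)
```

Both orderings approximate the same number, the integral of $e^{-|y|}$ over $[-1/2, 1/2]$.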


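(ii) Using (5) one has that for all $\epsilon > 0$

$$\int_{|x| \ge \epsilon} \frac{dx}{|x|^a} = \epsilon^{-a+d} \int_{|x| \ge 1} \frac{dx}{|x|^a}, \quad \text{whenever } a > d, \tag{7}$$

and

$$\int_{|x| \le \epsilon} \frac{dx}{|x|^a} = \epsilon^{-a+d} \int_{|x| \le 1} \frac{dx}{|x|^a}, \quad \text{whenever } a < d. \tag{8}$$

It can also be seen that the integrals $\int_{|x| \ge 1} \frac{dx}{|x|^a}$ and $\int_{|x| \le 1} \frac{dx}{|x|^a}$
(respectively, when $a > d$ and $a < d$) are finite by the argument that appears
after Corollary 1.10.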
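For the reader who wants the intermediate step, here is a sketch of how (7) follows from (5); the particular choice of function and dilation factor is ours. Apply (5) to $f(x) = |x|^{-a}\chi_{\{|x| \ge 1\}}(x)$ with $\delta = 1/\epsilon$. Since $f(x/\epsilon) = \epsilon^{a}\,|x|^{-a}\chi_{\{|x| \ge \epsilon\}}(x)$, the identity (5) reads

$$\epsilon^{-d}\,\epsilon^{a} \int_{|x| \ge \epsilon} \frac{dx}{|x|^a} = \int_{|x| \ge 1} \frac{dx}{|x|^a},$$

and multiplying both sides by $\epsilon^{d-a}$ gives (7). The same computation with $\chi_{\{|x| \le 1\}}$ in place of $\chi_{\{|x| \ge 1\}}$ gives (8).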


Translations and continuity 


We shall next examine how continuity properties of $f$ are related to the
way the translations $f_h$ vary with $h$. Note that for any given $x \in \mathbb{R}^d$, the
statement that $f_h(x) \to f(x)$ as $h \to 0$ is the same as the continuity of
$f$ at the point $x$.

However, a general $f$ which is integrable may be discontinuous at
every $x$, even when corrected on a set of measure zero; see Exercise 15.
Nevertheless, there is an overall continuity that an arbitrary $f \in L^1(\mathbb{R}^d)$
enjoys, one that holds in the norm.

Proposition 2.5 Suppose $f \in L^1(\mathbb{R}^d)$. Then

$$\|f_h - f\|_{L^1} \to 0 \quad \text{as } h \to 0.$$

The proof is a simple consequence of the approximation of integrable
functions by continuous functions of compact support as given in
Theorem 2.4. In fact for any $\epsilon > 0$, we can find such a function $g$ so that
$\|f - g\| < \epsilon$. Now

$$f_h - f = (g_h - g) + (f_h - g_h) - (f - g).$$

However, $\|f_h - g_h\| = \|f - g\| < \epsilon$, while since $g$ is continuous and has
compact support we have that clearly

$$\|g_h - g\| = \int_{\mathbb{R}^d} |g(x - h) - g(x)|\,dx \to 0 \quad \text{as } h \to 0.$$

So if $|h| < \delta$, where $\delta$ is sufficiently small, then $\|g_h - g\| < \epsilon$, and as a
result $\|f_h - f\| < 3\epsilon$, whenever $|h| < \delta$.
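The proposition can be seen at work on a discontinuous function. In the sketch below, which rests on our own choice of example, $f$ is the indicator of $[0,1]$ on the line; a direct computation gives $\|f_h - f\|_{L^1} = 2|h|$ for $|h| \le 1$, even though $f_h(x) \to f(x)$ fails at the two jump points. The helper `translation_distance` is a hypothetical name and uses a midpoint-rule approximation.

```python
def indicator(x):
    """Indicator of the interval [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def translation_distance(h, lo=-2.0, hi=3.0, cells=200000):
    """Midpoint-rule approximation of the L^1 distance between the
    indicator of [0, 1] and its translate by h (valid for |h| <= 1)."""
    step = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        x = lo + (i + 0.5) * step
        total += abs(indicator(x - h) - indicator(x))
    return total * step


shifts = (0.5, 0.1, 0.01)
distances = {h: translation_distance(h) for h in shifts}
for h in shifts:
    print(h, distances[h], 2 * h)  # the exact distance is 2 * h
```

The distances shrink linearly with $h$, as continuity in the norm requires.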
3 Fubini's theorem

In elementary calculus integrals of continuous functions of several
variables are often calculated by iterating one-dimensional integrals. We
shall now examine this important analytic device from the general point
of view of Lebesgue integration in $\mathbb{R}^d$, and we shall see that a number of
interesting issues arise.

In general, we may write $\mathbb{R}^d$ as a product

$$\mathbb{R}^d = \mathbb{R}^{d_1} \times \mathbb{R}^{d_2} \quad \text{where } d = d_1 + d_2, \text{ and } d_1, d_2 \ge 1.$$

A point in $\mathbb{R}^d$ then takes the form $(x, y)$, where $x \in \mathbb{R}^{d_1}$ and $y \in \mathbb{R}^{d_2}$.
With such a decomposition of $\mathbb{R}^d$ in mind, the general notion of a slice,
formed by fixing one variable, becomes natural. If $f$ is a function in
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, the slice of $f$ corresponding to $y \in \mathbb{R}^{d_2}$ is the function $f^y$ of
the $x \in \mathbb{R}^{d_1}$ variable, given by

$$f^y(x) = f(x, y).$$

Similarly, the slice of $f$ for a fixed $x \in \mathbb{R}^{d_1}$ is $f_x(y) = f(x, y)$.

In the case of a set $E \subset \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$ we define its slices by

$$E^y = \{x \in \mathbb{R}^{d_1} : (x, y) \in E\} \quad\text{and}\quad E_x = \{y \in \mathbb{R}^{d_2} : (x, y) \in E\}.$$

See Figure 1 for an illustration.

Figure 1. Slices $E^y$ and $E_x$ (for fixed $x$ and $y$) of a set $E$

3.1 Statement and proof of the theorem 

That the theorem that follows is not entirely straightforward is clear
from the first difficulty that arises in its formulation, involving the
measurability of the functions and sets in question. In fact, even with the
assumption that $f$ is measurable on $\mathbb{R}^d$, it is not necessarily true that
the slice $f^y$ is measurable on $\mathbb{R}^{d_1}$ for each $y$; nor does the corresponding
assertion necessarily hold for a measurable set: the slice $E^y$ may not
be measurable for each $y$. An easy example arises in $\mathbb{R}^2$ by placing a
one-dimensional non-measurable set on the $x$-axis; the set $E$ in $\mathbb{R}^2$ has
measure zero, but $E^y$ is not measurable for $y = 0$. What saves us is that,
nevertheless, measurability holds for almost all slices.

The main theorem is as follows. We recall that by definition all
integrable functions are measurable.


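Theorem 3.1 Suppose $f(x, y)$ is integrable on $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. Then for
almost every $y \in \mathbb{R}^{d_2}$:

(i) The slice $f^y$ is integrable on $\mathbb{R}^{d_1}$.

(ii) The function defined by $\int_{\mathbb{R}^{d_1}} f^y(x)\,dx$ is integrable on $\mathbb{R}^{d_2}$.

Moreover:

(iii) $$\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x, y)\,dx \right) dy = \int_{\mathbb{R}^d} f.$$

Clearly, the theorem is symmetric in $x$ and $y$ so that we also may conclude
that the slice $f_x$ is integrable on $\mathbb{R}^{d_2}$ for a.e. $x$. Moreover, $\int_{\mathbb{R}^{d_2}} f_x(y)\,dy$
is integrable, and

$$\int_{\mathbb{R}^{d_1}} \left( \int_{\mathbb{R}^{d_2}} f(x, y)\,dy \right) dx = \int_{\mathbb{R}^d} f.$$

In particular, Fubini's theorem states that the integral of $f$ on $\mathbb{R}^d$ can
be computed by iterating lower-dimensional integrals, and that the
iterations can be taken in any order

$$\int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x, y)\,dx \right) dy = \int_{\mathbb{R}^{d_1}} \left( \int_{\mathbb{R}^{d_2}} f(x, y)\,dy \right) dx = \int_{\mathbb{R}^d} f.$$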
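The two iterated integrals can be compared numerically in a simple case. The sketch below is only an illustration under assumptions that are ours: the integrand is $f(x, y) = x y^2$ on the rectangle $[0,1] \times [0,2]$ (and $0$ elsewhere), and each one-dimensional integral is replaced by a midpoint rule, so the agreement of the two finite sums is elementary and merely mirrors the statement of the theorem. The function names are hypothetical.

```python
def integrate_x_then_y(f, ax, bx, ay, by, n=200):
    """Approximate the iterated integral: inner integral in x, outer in y."""
    hx, hy = (bx - ax) / n, (by - ay) / n
    total = 0.0
    for j in range(n):
        y = ay + (j + 0.5) * hy
        inner = sum(f(ax + (i + 0.5) * hx, y) for i in range(n)) * hx
        total += inner * hy
    return total


def integrate_y_then_x(f, ax, bx, ay, by, n=200):
    """Approximate the iterated integral: inner integral in y, outer in x."""
    hx, hy = (bx - ax) / n, (by - ay) / n
    total = 0.0
    for i in range(n):
        x = ax + (i + 0.5) * hx
        inner = sum(f(x, ay + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total


def integrand(x, y):
    return x * y * y


first = integrate_x_then_y(integrand, 0.0, 1.0, 0.0, 2.0)
second = integrate_y_then_x(integrand, 0.0, 1.0, 0.0, 2.0)
print(first, second, 4.0 / 3.0)  # exact value: (1/2) * (8/3) = 4/3
```

Both orders approximate the exact value $\tfrac12 \cdot \tfrac83 = \tfrac43$.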


We first note that we may assume that $f$ is real-valued, since the
theorem then applies to the real and imaginary parts of a complex-valued
function. The proof of Fubini's theorem which we give next consists of a
sequence of six steps. We begin by letting $\mathcal{F}$ denote the set of integrable
functions on $\mathbb{R}^d$ which satisfy all three conclusions in the theorem, and
set out to prove that $L^1(\mathbb{R}^d) \subset \mathcal{F}$.

We proceed by first showing that $\mathcal{F}$ is closed under operations such
as linear combinations (Step 1) and limits (Step 2). Then we begin to
construct families of functions in $\mathcal{F}$. Since any integrable function is the
"limit" of simple functions, and simple functions are themselves linear
combinations of characteristic functions of sets of finite measure, the goal
quickly becomes to prove that $\chi_E$ belongs to $\mathcal{F}$ whenever $E$ is a measurable
subset of $\mathbb{R}^d$ with finite measure. To achieve this goal, we begin with
rectangles and work our way up to sets of type $G_\delta$ (Step 3), and sets of
measure zero (Step 4). Finally, a limiting argument shows that all integrable
functions are in $\mathcal{F}$. This will complete the proof of Fubini's theorem.


Step 1. Any finite linear combination of functions in $\mathcal{F}$ also belongs
to $\mathcal{F}$.

Indeed, let $\{f_k\}_{k=1}^{N} \subset \mathcal{F}$. For each $k$ there exists a set $A_k \subset \mathbb{R}^{d_2}$ of
measure 0 so that $f_k^y$ is integrable on $\mathbb{R}^{d_1}$ whenever $y \notin A_k$. Then, if
$A = \bigcup_{k=1}^{N} A_k$, the set $A$ has measure 0, and in the complement of $A$,
the $y$-slice corresponding to any finite linear combination of the $f_k$ is
measurable, and also integrable. By linearity of the integral, we then
conclude that any linear combination of the $f_k$ belongs to $\mathcal{F}$.


Step 2. Suppose {fk} is a sequence of measurable functions in T so 
that fk/"forfk\ /, where / is integrable (on R d ). Then f E J 7 . 

By taking —fk instead of fk if necessary, we note that it suffices to 
consider the case of an increasing sequence. Also, we may replace fk 
by fk — fi and assume that the fk’s are non-negative. Now, we observe 
that an application of the monotone convergence theorem (Corollary 1.9) 
yields 

(9) lim / fk(x,y)dxdy= I f(x,y)dxdy. 

fc ^°° JR d JR d 

By assumption, for each k there exists a set C M^ 2 , so that is 
integrable on R dl whenever y ^ A^. li A = !J=i then m{A) = 0 in 
M^ 2 , and ii y ^ A, then is integrable on M, dl for all /c, and, by the 
monotone convergence theorem, we find that 


9k(y) 



fk( x ) dx 


increases to a limit 


g{v) = / f y (x)dx 

JR d i 


as $k$ tends to infinity. By assumption, each $g_k(y)$ is integrable, so that
another application of the monotone convergence theorem yields

$$ \int_{\mathbb{R}^{d_2}} g_k(y)\,dy \to \int_{\mathbb{R}^{d_2}} g(y)\,dy \quad \text{as } k \to \infty. \tag{10} $$

By the assumption that $f_k \in \mathcal{F}$ we have

$$ \int_{\mathbb{R}^{d_2}} g_k(y)\,dy = \int_{\mathbb{R}^d} f_k(x,y)\,dx\,dy, $$

and combining this fact with (9) and (10), we conclude that

$$ \int_{\mathbb{R}^{d_2}} g(y)\,dy = \int_{\mathbb{R}^d} f(x,y)\,dx\,dy. $$


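Since $f$ is integrable, the right-hand integral is finite, and this proves that
$g$ is integrable. Consequently $g(y) < \infty$ for a.e. $y$, hence $f^y$ is integrable
for a.e. $y$, and

$$ \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x,y)\,dx \right) dy = \int_{\mathbb{R}^d} f(x,y)\,dx\,dy. $$

This proves that $f \in \mathcal{F}$, as desired.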

Step 3. Any characteristic function of a set $E$ that is a $G_\delta$ and of finite
measure belongs to $\mathcal{F}$.

We proceed in stages of increasing order of generality.

(a) First suppose $E$ is a bounded open cube in $\mathbb{R}^d$, such that
$E = Q_1 \times Q_2$, where $Q_1$ and $Q_2$ are open cubes in $\mathbb{R}^{d_1}$ and
$\mathbb{R}^{d_2}$, respectively. Then, for each $y$ the function $\chi_E(x,y)$ is
measurable in $x$, and integrable with

$$ g(y) = \int_{\mathbb{R}^{d_1}} \chi_E(x,y)\,dx = \begin{cases} |Q_1| & \text{if } y \in Q_2, \\ 0 & \text{otherwise.} \end{cases} $$

Consequently, $g = |Q_1|\,\chi_{Q_2}$ is also measurable and integrable, with

$$ \int_{\mathbb{R}^{d_2}} g(y)\,dy = |Q_1|\,|Q_2|. $$

Since we initially have $\int_{\mathbb{R}^d} \chi_E(x,y)\,dx\,dy = |E| = |Q_1|\,|Q_2|$, we deduce
that $\chi_E \in \mathcal{F}$.

(b) Now suppose $E$ is a subset of the boundary of some closed cube.
Then, since the boundary of a cube has measure 0 in $\mathbb{R}^d$, we have
$\int_{\mathbb{R}^d} \chi_E(x,y)\,dx\,dy = 0$.

Next, we note, after an investigation of the various possibilities, that
for almost every $y$, the slice $E^y$ has measure 0 in $\mathbb{R}^{d_1}$, and therefore if
$g(y) = \int_{\mathbb{R}^{d_1}} \chi_E(x,y)\,dx$ we have $g(y) = 0$ for a.e. $y$. As a consequence,
$\int_{\mathbb{R}^{d_2}} g(y)\,dy = 0$, and therefore $\chi_E \in \mathcal{F}$.

(c) Suppose now $E$ is a finite union of closed cubes whose interiors are
disjoint, $E = \bigcup_{k=1}^K Q_k$. Then, if $\tilde{Q}_k$ denotes the interior of $Q_k$, we may
write $\chi_E$ as a linear combination of the $\chi_{\tilde{Q}_k}$ and $\chi_{A_k}$, where $A_k$ is a
subset of the boundary of $Q_k$ for $k = 1, \ldots, K$. By our previous analysis,
we know that $\chi_{\tilde{Q}_k}$ and $\chi_{A_k}$ belong to $\mathcal{F}$ for all $k$, and since Step 1
guarantees that $\mathcal{F}$ is closed under finite linear combinations, we conclude
that $\chi_E \in \mathcal{F}$, as desired.


(d) Next, we prove that if $E$ is open and of finite measure, then
$\chi_E \in \mathcal{F}$. This follows from taking a limit in the previous case. Indeed, by
Theorem 1.4 in Chapter 1, we may write $E$ as a countable union of
almost disjoint closed cubes

$$ E = \bigcup_{j=1}^\infty Q_j. $$

Consequently, if we let $f_k = \chi_{\bigcup_{j=1}^k Q_j}$, then we note that the functions
$f_k$ increase to $f = \chi_E$, which is integrable since $m(E)$ is finite. Therefore,
we may conclude by Step 2 that $f \in \mathcal{F}$.

(e) Finally, if $E$ is a $G_\delta$ of finite measure, then $\chi_E \in \mathcal{F}$. Indeed, by
definition, there exist open sets $\tilde{O}_1, \tilde{O}_2, \ldots$, such that

$$ E = \bigcap_{k=1}^\infty \tilde{O}_k. $$

Since $E$ has finite measure, there exists an open set $\tilde{O}_0$ of finite measure
with $E \subset \tilde{O}_0$. If we let

$$ O_k = \tilde{O}_0 \cap \bigcap_{j=1}^k \tilde{O}_j, $$

then we note that we have a decreasing sequence of open sets of finite
measure $O_1 \supset O_2 \supset \cdots$ with

$$ E = \bigcap_{k=1}^\infty O_k. $$

Therefore, the sequence of functions $f_k = \chi_{O_k}$ decreases to $f = \chi_E$, and
since $\chi_{O_k} \in \mathcal{F}$ for all $k$ by (d) above, we conclude by Step 2 that $\chi_E$
belongs to $\mathcal{F}$.

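Step 4. If $E$ has measure 0, then $\chi_E$ belongs to $\mathcal{F}$.

Indeed, since $E$ is measurable, we may choose a set $G$ of type $G_\delta$ with
$E \subset G$ and $m(G) = 0$ (Corollary 3.5, Chapter 1). Since $\chi_G \in \mathcal{F}$ (by the
previous step) we find that

$$ \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} \chi_G(x,y)\,dx \right) dy = \int_{\mathbb{R}^d} \chi_G = 0. $$

Therefore

$$ \int_{\mathbb{R}^{d_1}} \chi_G(x,y)\,dx = 0 \quad \text{for a.e. } y. $$

Consequently, the slice $G^y$ has measure 0 for a.e. $y$. The simple observation
that $E^y \subset G^y$ then shows that $E^y$ has measure 0 for a.e. $y$, and
$\int_{\mathbb{R}^{d_1}} \chi_E(x,y)\,dx = 0$ for a.e. $y$. Therefore,

$$ \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} \chi_E(x,y)\,dx \right) dy = 0 = \int_{\mathbb{R}^d} \chi_E, $$

and thus $\chi_E \in \mathcal{F}$, as was to be shown.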

Step 5. If $E$ is any measurable subset of $\mathbb{R}^d$ with finite measure, then
$\chi_E$ belongs to $\mathcal{F}$.

To prove this, recall first that there exists a set of finite measure $G$ of
type $G_\delta$, with $E \subset G$ and $m(G - E) = 0$. Since

$$ \chi_E = \chi_G - \chi_{G-E}, $$

and $\mathcal{F}$ is closed under linear combinations, we find that $\chi_E \in \mathcal{F}$, as
desired.

Step 6. This is the final step, which consists of proving that if $f$ is
integrable, then $f \in \mathcal{F}$.

We note first that $f$ has the decomposition $f = f^+ - f^-$, where both $f^+$
and $f^-$ are non-negative and integrable, so by Step 1 we may assume
that $f$ is itself non-negative. By Theorem 4.1 in the previous chapter,
there exists a sequence $\{\varphi_k\}$ of simple functions that increase to $f$. Since
each $\varphi_k$ is a finite linear combination of characteristic functions of sets
with finite measure, we have $\varphi_k \in \mathcal{F}$ by Steps 5 and 1, hence $f \in \mathcal{F}$ by
Step 2.

3.2 Applications of Fubini’s theorem 

Theorem 3.2 Suppose $f(x,y)$ is a non-negative measurable function on
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. Then for almost every $y \in \mathbb{R}^{d_2}$:

(i) The slice $f^y$ is measurable on $\mathbb{R}^{d_1}$.

(ii) The function defined by $\int_{\mathbb{R}^{d_1}} f^y(x)\,dx$ is measurable on $\mathbb{R}^{d_2}$.

Moreover:

$$ \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x,y)\,dx \right) dy = \int_{\mathbb{R}^d} f(x,y)\,dx\,dy \quad \text{in the extended sense.} $$


In practice, this theorem is often used in conjunction with Fubini’s
theorem.³ Indeed, suppose we are given a measurable function $f$ on $\mathbb{R}^d$
and asked to compute $\int_{\mathbb{R}^d} f$. To justify the use of iterated integration, we
first apply the present theorem to $|f|$. Using it, we may freely compute
(or estimate) the iterated integrals of the non-negative function $|f|$. If
these are finite, Theorem 3.2 guarantees that $f$ is integrable, that is,
$\int |f| < \infty$. Then the hypothesis in Fubini’s theorem is verified, and we
may use that theorem in the calculation of the integral of $f$.
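
Why integrability of $|f|$ must be checked first can be seen in a discrete analogue, with sums over non-negative integers in place of integrals. The following Python sketch is an illustration only and is not taken from the text: the array `a(n, m)` (equal to $+1$ on the diagonal and $-1$ just above it) and all helper names are assumptions made for this example. Each inner sum is computed exactly, because every row and every column has at most two non-zero entries.

```python
def a(n, m):
    """Entry of an infinite array: +1 on the diagonal, -1 just above it, 0 elsewhere."""
    if m == n:
        return 1
    if m == n + 1:
        return -1
    return 0


def row_sum(n):
    # For fixed n the only non-zero entries are m = n and m = n + 1,
    # so this finite sum equals the full sum over all m >= 0.
    return sum(a(n, m) for m in (n, n + 1))


def col_sum(m):
    # For fixed m the only non-zero entries are n = m - 1 (if m >= 1) and n = m.
    return sum(a(n, m) for n in (m - 1, m) if n >= 0)


def abs_mass(N):
    # Sum of |a(n, m)| over the first N rows: each row contributes 2,
    # so the total is unbounded as N grows (the array is not "integrable").
    return sum(abs(a(n, m)) for n in range(N) for m in (n, n + 1))


N = 1000
rows_then_cols = sum(row_sum(n) for n in range(N))  # every row sums to 0
cols_then_rows = sum(col_sum(m) for m in range(N))  # column 0 sums to 1, the others to 0
print(rows_then_cols, cols_then_rows, abs_mass(N))
```

The two iterated sums disagree (0 against 1), which is only possible because the sum of the absolute values diverges; when that sum is finite, the order of summation is immaterial, in parallel with the role played by $\int |f| < \infty$ above.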


Proof of Theorem 3.2. Consider the truncations

$$ f_k(x,y) = \begin{cases} f(x,y) & \text{if } |(x,y)| < k \text{ and } f(x,y) < k, \\ 0 & \text{otherwise.} \end{cases} $$

Each $f_k$ is integrable, and by part (i) in Fubini’s theorem there exists a
set $E_k \subset \mathbb{R}^{d_2}$ of measure 0 such that the slice $f_k^y(x)$ is measurable for all
$y \in E_k^c$. Then, if we set $E = \bigcup_k E_k$, we find that $f_k^y(x)$ is measurable for
all $y \in E^c$ and all $k$. Moreover, $m(E) = 0$. Since $f_k^y \nearrow f^y$, the monotone
convergence theorem implies that if $y \notin E$, then

$$ \int_{\mathbb{R}^{d_1}} f_k(x,y)\,dx \nearrow \int_{\mathbb{R}^{d_1}} f(x,y)\,dx \quad \text{as } k \to \infty. $$

Again by Fubini’s theorem, $\int_{\mathbb{R}^{d_1}} f_k(x,y)\,dx$ is measurable for all $y \in E^c$,
hence so is $\int_{\mathbb{R}^{d_1}} f(x,y)\,dx$. Another application of the monotone convergence
theorem then gives

$$ \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f_k(x,y)\,dx \right) dy \to \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f(x,y)\,dx \right) dy. \tag{11} $$

By part (iii) in Fubini’s theorem we know that

$$ \int_{\mathbb{R}^{d_2}} \left( \int_{\mathbb{R}^{d_1}} f_k(x,y)\,dx \right) dy = \int_{\mathbb{R}^d} f_k. \tag{12} $$


3 Theorem 3.2 was formulated by Tonelli. We will, however, use the short-hand of
referring to it, as well as Theorem 3.1 and Corollary 3.3, as Fubini’s theorem.

A final application of the monotone convergence theorem directly to $f_k$
also gives

$$ \int_{\mathbb{R}^d} f_k \to \int_{\mathbb{R}^d} f. \tag{13} $$


Combining (11), (12), and (13) completes the proof of Theorem 3.2. 


Corollary 3.3 If $E$ is a measurable set in $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, then for almost
every $y \in \mathbb{R}^{d_2}$ the slice

$$ E^y = \{x \in \mathbb{R}^{d_1} : (x,y) \in E\} $$

is a measurable subset of $\mathbb{R}^{d_1}$. Moreover $m(E^y)$ is a measurable function
of $y$ and

$$ m(E) = \int_{\mathbb{R}^{d_2}} m(E^y)\,dy. $$

This is an immediate consequence of the first part of Theorem 3.2 applied
to the function $\chi_E$. Clearly a symmetric result holds for the $x$-slices in
$\mathbb{R}^{d_2}$.

We have thus established the basic fact that if $E$ is measurable on
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, then for almost every $y \in \mathbb{R}^{d_2}$ the slice $E^y$ is measurable in
$\mathbb{R}^{d_1}$ (and also the symmetric statement with the roles of $x$ and $y$
interchanged). One might be tempted to think that the converse assertion
holds. To see that this is not the case, note that if we let $\mathcal{N}$ denote a
non-measurable subset of $\mathbb{R}$, and then define

$$ E = [0,1] \times \mathcal{N} \subset \mathbb{R} \times \mathbb{R}, $$

we see that

$$ E^y = \begin{cases} [0,1] & \text{if } y \in \mathcal{N}, \\ \emptyset & \text{if } y \notin \mathcal{N}. \end{cases} $$

Thus $E^y$ is measurable for every $y$. However, if $E$ were measurable, then
the corollary would imply that $E_x = \{y \in \mathbb{R} : (x,y) \in E\}$ is measurable
for almost every $x \in \mathbb{R}$, which is not true since $E_x$ is equal to $\mathcal{N}$ for all
$x \in [0,1]$.

A more striking example is that of a set $E$ in the unit square $[0,1] \times
[0,1]$ that is not measurable, and yet the slices $E^y$ and $E_x$ are measurable
with $m(E^y) = 0$ and $m(E_x) = 1$ for each $x, y \in [0,1]$. The construction
of $E$ is based on the existence of a highly paradoxical ordering $\prec$ of
the reals, with the property that $\{x : x \prec y\}$ is a countable set for each
$y \in \mathbb{R}$. (The construction of this ordering is discussed in Problem 5.)
Given this ordering we let

$$ E = \{(x,y) \in [0,1] \times [0,1], \text{ with } x \prec y\}. $$

Note that for each $y \in [0,1]$, $E^y = \{x : x \prec y\}$; thus $E^y$ is countable and
$m(E^y) = 0$. Similarly $m(E_x) = 1$, because $E_x$ is the complement of a
denumerable set in $[0,1]$. If $E$ were measurable, it would contradict the
formula in Corollary 3.3.

In relating a set $E$ to its slices $E_x$ and $E^y$, matters are straightforward
for the basic sets which arise when we consider $\mathbb{R}^d$ as the product
$\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$. These are the product sets $E = E_1 \times E_2$, where $E_j \subset \mathbb{R}^{d_j}$.

Proposition 3.4 If $E = E_1 \times E_2$ is a measurable subset of $\mathbb{R}^d$, and
$m_*(E_2) > 0$, then $E_1$ is measurable.

Proof. By Corollary 3.3, we know that for a.e. $y \in \mathbb{R}^{d_2}$, the slice
function

$$ (\chi_{E_1 \times E_2})^y(x) = \chi_{E_1}(x)\,\chi_{E_2}(y) $$

is measurable as a function of $x$. In fact, we claim that there is some
$y \in E_2$ such that the above slice function is measurable in $x$; for such a
$y$ we would have $\chi_{E_1 \times E_2}(x,y) = \chi_{E_1}(x)$, and this would imply that $E_1$
is measurable.

To prove the existence of such a $y$, we use the assumption that $m_*(E_2) >
0$. Indeed, let $F$ denote the set of $y \in \mathbb{R}^{d_2}$ such that the slice $E^y$ is
measurable. Then $m(F^c) = 0$ (by the previous corollary). However,
$E_2 \cap F$ is not empty because $m_*(E_2 \cap F) > 0$. To see this, note that
$E_2 = (E_2 \cap F) \cup (E_2 \cap F^c)$, hence

$$ 0 < m_*(E_2) \le m_*(E_2 \cap F) + m_*(E_2 \cap F^c) = m_*(E_2 \cap F), $$

because $E_2 \cap F^c$ is a subset of a set of measure zero.

To deal with a converse of the above result, we need the following 
lemma. 

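Lemma 3.5 If $E_1 \subset \mathbb{R}^{d_1}$ and $E_2 \subset \mathbb{R}^{d_2}$, then

$$ m_*(E_1 \times E_2) \le m_*(E_1)\, m_*(E_2), $$

with the understanding that if one of the sets $E_j$ has exterior measure
zero, then $m_*(E_1 \times E_2) = 0$.

Proof. Let $\epsilon > 0$. By definition, we can find cubes $\{Q_k\}_{k=1}^\infty$ in $\mathbb{R}^{d_1}$
and $\{Q'_\ell\}_{\ell=1}^\infty$ in $\mathbb{R}^{d_2}$ such that

$$ E_1 \subset \bigcup_{k=1}^\infty Q_k \quad \text{and} \quad E_2 \subset \bigcup_{\ell=1}^\infty Q'_\ell $$

and

$$ \sum_{k=1}^\infty |Q_k| \le m_*(E_1) + \epsilon \quad \text{and} \quad \sum_{\ell=1}^\infty |Q'_\ell| \le m_*(E_2) + \epsilon. $$

Since $E_1 \times E_2 \subset \bigcup_{k,\ell=1}^\infty Q_k \times Q'_\ell$, the sub-additivity of the exterior
measure yields

$$ m_*(E_1 \times E_2) \le \sum_{k,\ell=1}^\infty |Q_k \times Q'_\ell| = \left( \sum_{k=1}^\infty |Q_k| \right) \left( \sum_{\ell=1}^\infty |Q'_\ell| \right) \le (m_*(E_1) + \epsilon)(m_*(E_2) + \epsilon). $$

If neither $E_1$ nor $E_2$ has exterior measure 0, then from the above we find

$$ m_*(E_1 \times E_2) \le m_*(E_1)\, m_*(E_2) + O(\epsilon), $$

and since $\epsilon$ is arbitrary, we must have $m_*(E_1 \times E_2) \le m_*(E_1)\, m_*(E_2)$.

If for instance $m_*(E_1) = 0$, consider for each positive integer $j$ the
set $E_2^j = E_2 \cap \{y \in \mathbb{R}^{d_2} : |y| \le j\}$. Then, by the above argument, we
find that $m_*(E_1 \times E_2^j) = 0$. Since $(E_1 \times E_2^j) \nearrow (E_1 \times E_2)$ as $j \to \infty$, we
conclude that $m_*(E_1 \times E_2) = 0$.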

Proposition 3.6 Suppose $E_1$ and $E_2$ are measurable subsets of $\mathbb{R}^{d_1}$ and
$\mathbb{R}^{d_2}$, respectively. Then $E = E_1 \times E_2$ is a measurable subset of $\mathbb{R}^d$.
Moreover,

$$ m(E) = m(E_1)\, m(E_2), $$

with the understanding that if one of the sets $E_j$ has measure zero, then
$m(E) = 0$.



Proof. It suffices to prove that $E$ is measurable, because then the
assertion about $m(E)$ follows from Corollary 3.3. Since each set $E_j$ is
measurable, there exist sets $G_j \subset \mathbb{R}^{d_j}$ of type $G_\delta$, with $G_j \supset E_j$ and
$m_*(G_j - E_j) = 0$ for each $j = 1, 2$. (See Corollary 3.5 in Chapter 1.)
Clearly, $G = G_1 \times G_2$ is measurable in $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$ and

$$ (G_1 \times G_2) - (E_1 \times E_2) \subset \big( (G_1 - E_1) \times G_2 \big) \cup \big( G_1 \times (G_2 - E_2) \big). $$

By the lemma we conclude that $m_*(G - E) = 0$, hence $E$ is measurable.


As a consequence of this proposition we have the following. 

Corollary 3.7 Suppose $f$ is a measurable function on $\mathbb{R}^{d_1}$. Then the
function $\tilde{f}$ defined by $\tilde{f}(x,y) = f(x)$ is measurable on $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$.

Proof. To see this, we may assume that $f$ is real-valued, and recall
first that if $a \in \mathbb{R}$ and $E_1 = \{x \in \mathbb{R}^{d_1} : f(x) < a\}$, then $E_1$ is measurable
by definition. Since

$$ \{(x,y) \in \mathbb{R}^{d_1} \times \mathbb{R}^{d_2} : \tilde{f}(x,y) < a\} = E_1 \times \mathbb{R}^{d_2}, $$

the previous proposition shows that $\{\tilde{f}(x,y) < a\}$ is measurable for each
$a \in \mathbb{R}$. Thus $\tilde{f}(x,y)$ is a measurable function on $\mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$, as desired.

Finally, we return to an interpretation of the integral that arose first in
the calculus. We have in mind the notion that $\int f$ describes the “area”
under the graph of $f$. Here we relate this to the Lebesgue integral and
show how it extends to our more general context.

Corollary 3.8 Suppose $f(x)$ is a non-negative function on $\mathbb{R}^d$, and let

$$ A = \{(x,y) \in \mathbb{R}^d \times \mathbb{R} : 0 \le y \le f(x)\}. $$

Then:

(i) $f$ is measurable on $\mathbb{R}^d$ if and only if $A$ is measurable in $\mathbb{R}^{d+1}$.

(ii) If the conditions in (i) hold, then

$$ \int_{\mathbb{R}^d} f(x)\,dx = m(A). $$

Proof. If $f$ is measurable on $\mathbb{R}^d$, then the previous proposition
guarantees that the function

$$ F(x,y) = y - f(x) $$

is measurable on $\mathbb{R}^{d+1}$, so we immediately see that $A = \{y \ge 0\} \cap \{F \le 0\}$
is measurable.

Conversely, suppose that $A$ is measurable. We note that for each
$x \in \mathbb{R}^d$ the slice $A_x = \{y \in \mathbb{R} : (x,y) \in A\}$ is a closed segment, namely
$A_x = [0, f(x)]$. Consequently Corollary 3.3 (with the roles of $x$ and $y$
interchanged) yields the measurability of $m(A_x) = f(x)$. Moreover

$$ m(A) = \int \chi_A(x,y)\,dx\,dy = \int_{\mathbb{R}^d} m(A_x)\,dx = \int_{\mathbb{R}^d} f(x)\,dx, $$

as was to be shown.
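
The identity $\int f = m(A)$ can be checked numerically in a simple case. The Python sketch below is an illustration under stated assumptions, not a method from the text: it takes $f(x) = \sqrt{1 - x^2}$ on $[-1, 1]$, approximates $\int f$ by a midpoint Riemann sum, and approximates $m(A)$ by counting the cells of a uniform grid whose midpoints lie under the graph. The function names and grid sizes are choices made here.

```python
import math


def riemann_integral(f, a, b, n):
    """Midpoint Riemann sum for the integral of f over [a, b] with n subintervals."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h


def region_measure(f, a, b, height, nx, ny):
    """Approximate m(A) for A = {(x, y): a <= x <= b, 0 <= y <= f(x)} by counting
    the cells of an nx-by-ny grid on [a, b] x [0, height] whose midpoint lies in A.
    `height` is assumed to be an upper bound for f on [a, b]."""
    hx = (b - a) / nx
    hy = height / ny
    count = 0
    for i in range(nx):
        fx = f(a + (i + 0.5) * hx)
        for j in range(ny):
            if (j + 0.5) * hy <= fx:
                count += 1
    return count * hx * hy


def f(x):
    # Upper half of the unit circle; max() guards against tiny negative rounding.
    return math.sqrt(max(0.0, 1.0 - x * x))


integral = riemann_integral(f, -1.0, 1.0, 400)
area = region_measure(f, -1.0, 1.0, 1.0, 400, 400)
print(integral, area, math.pi / 2)
```

Both approximations should be close to $\pi/2$, the area of the upper half of the unit disc. In each grid column the counted height differs from $f$ at the column midpoint by at most half a cell, so the two numbers differ by at most the vertical cell size.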

We conclude this section with a useful result. 


Proposition 3.9 If $f$ is a measurable function on $\mathbb{R}^d$, then the function
$\tilde{f}(x,y) = f(x - y)$ is measurable on $\mathbb{R}^d \times \mathbb{R}^d$.

By picking $E = \{z \in \mathbb{R}^d : f(z) < a\}$, we see that it suffices to prove
that whenever $E$ is a measurable subset of $\mathbb{R}^d$, then $\tilde{E} = \{(x,y) : x - y \in
E\}$ is a measurable subset of $\mathbb{R}^d \times \mathbb{R}^d$.

Note first that if $O$ is an open set, then $\tilde{O}$ is also open. Taking countable
intersections shows that if $E$ is a $G_\delta$ set, then so is $\tilde{E}$. Next consider
the sets $\tilde{E}_k = \tilde{E} \cap \tilde{B}_k$, where $\tilde{B}_k = \{|y| < k\}$; we shall show that
$m(\tilde{E}_k) = 0$ for each $k$ whenever $m(E) = 0$.
Again, take $O$ to be open in $\mathbb{R}^d$, and let us calculate $m(\tilde{O} \cap \tilde{B}_k)$. We
have that $\chi_{\tilde{O} \cap \tilde{B}_k} = \chi_O(x - y)\,\chi_{B_k}(y)$. Hence

$$ m(\tilde{O} \cap \tilde{B}_k) = \int \chi_O(x - y)\,\chi_{B_k}(y)\,dy\,dx = \int \left( \int \chi_O(x - y)\,dx \right) \chi_{B_k}(y)\,dy = m(O)\, m(B_k), $$

by the translation-invariance of the measure. Now if $m(E) = 0$, there is
a sequence of open sets $O_n$ such that $E \subset O_n$ and $m(O_n) \to 0$. It follows
from the above that $\tilde{E}_k \subset \tilde{O}_n \cap \tilde{B}_k$ and $m(\tilde{O}_n \cap \tilde{B}_k) \to 0$ in $n$ for each
fixed $k$. This shows $m(\tilde{E}_k) = 0$, and hence $m(\tilde{E}) = 0$. The proof of the
proposition is concluded once we recall that any measurable set $E$ can
be written as the difference of a $G_\delta$ and a set of measure zero.

4* A Fourier inversion formula 


The question of the inversion of the Fourier transform encompasses in
effect the problem at the origin of Fourier analysis. This issue involves
establishing the validity of the inversion formula for a function $f$ in terms
of its Fourier transform $\hat{f}$, that is,

$$ \hat{f}(\xi) = \int_{\mathbb{R}^d} f(x)\, e^{-2\pi i x \cdot \xi}\,dx, \tag{14} $$

$$ f(x) = \int_{\mathbb{R}^d} \hat{f}(\xi)\, e^{2\pi i x \cdot \xi}\,d\xi. \tag{15} $$


We have already encountered this problem in Book I in the rudimentary
case when in fact both $f$ and $\hat{f}$ were continuous and had rapid (or
moderate) decrease at infinity. In Book II we also considered the question
in the one-dimensional setting, seen from the viewpoint of complex
analysis. The most elegant and useful formulations of Fourier inversion
are in terms of the $L^2$ theory, or in its greatest generality stated in the
language of distributions. We shall take up these matters systematically
later.⁴ It will, nevertheless, be enlightening to digress here to see what
our knowledge at this stage teaches us about this problem. We intend to
do this by presenting a variant of the inversion formula appropriate for
$L^1$, one that is both simple and adequate in many circumstances.

To begin with, we need to have an idea of what can be said about the
Fourier transform of an arbitrary function in $L^1(\mathbb{R}^d)$.

Proposition 4.1 Suppose $f \in L^1(\mathbb{R}^d)$. Then $\hat{f}$ defined by (14) is
continuous and bounded on $\mathbb{R}^d$.

In fact, since $|f(x)\, e^{-2\pi i x \cdot \xi}| = |f(x)|$, the integral representing $\hat{f}$
converges for each $\xi$ and $\sup_{\xi \in \mathbb{R}^d} |\hat{f}(\xi)| \le \int_{\mathbb{R}^d} |f(x)|\,dx = \|f\|$. To verify the
continuity, note that for every $x$, $f(x)\, e^{-2\pi i x \cdot \xi} \to f(x)\, e^{-2\pi i x \cdot \xi_0}$ as $\xi \to \xi_0$,
where $\xi_0$ is any point in $\mathbb{R}^d$; hence $\hat{f}(\xi) \to \hat{f}(\xi_0)$ by the dominated
convergence theorem.

One can assert a little more than the boundedness of $\hat{f}$; namely, one
has $\hat{f}(\xi) \to 0$ as $|\xi| \to \infty$, but not much more can be said about the
decrease at infinity of $\hat{f}$. (See Exercises 22 and 25.) As a consequence,
for general $f \in L^1(\mathbb{R}^d)$ the function $\hat{f}$ is not in $L^1(\mathbb{R}^d)$, and the presumed
formula (15) becomes problematical. The following theorem evades this
difficulty and yet is useful in a number of situations.
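
A concrete one-dimensional case may help fix ideas. The Python sketch below is an illustration chosen here, not an example from the text: it takes $f = \chi_{[0,1]}$, whose Fourier transform has the closed form $(1 - e^{-2\pi i \xi})/(2\pi i \xi)$, checks that closed form against a direct quadrature of (14), and then approximates $\int_1^R |\hat{f}(\xi)|\,d\xi$ for growing $R$. The function names, step sizes and sample points are assumptions of the sketch.

```python
import cmath
import math


def fhat_closed(xi):
    """Closed form of the Fourier transform of the indicator function of [0, 1]."""
    if xi == 0:
        return 1.0 + 0.0j
    return (1 - cmath.exp(-2j * math.pi * xi)) / (2j * math.pi * xi)


def fhat_numeric(xi, n=20000):
    """Midpoint-rule approximation of the integral of exp(-2 pi i x xi) over [0, 1]."""
    h = 1.0 / n
    return sum(cmath.exp(-2j * math.pi * (k + 0.5) * h * xi) for k in range(n)) * h


def partial_l1_mass(R, step=0.01):
    """Midpoint-rule approximation of the integral of |fhat| over [1, R]."""
    n = int(round((R - 1) / step))
    return sum(abs(fhat_closed(1 + (k + 0.5) * step)) for k in range(n)) * step


# |fhat| never exceeds the L^1 norm of f (which is 1) and is small for large |xi| ...
print(abs(fhat_closed(0.0)), abs(fhat_closed(50.5)))
# ... yet the partial integrals of |fhat| keep growing as R increases.
for R in (10, 100, 1000):
    print(R, round(partial_l1_mass(R), 3))
```

Here $|\hat{f}(\xi)| = |\sin(\pi \xi)| / (\pi |\xi|)$, so $\hat{f}$ is bounded by $\|f\| = 1$ and tends to 0, while the partial integrals grow roughly like a multiple of $\log R$: this $\hat{f}$ is not in $L^1(\mathbb{R})$.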

Theorem 4.2 Suppose $f \in L^1(\mathbb{R}^d)$ and assume also that $\hat{f} \in L^1(\mathbb{R}^d)$.
Then the inversion formula (15) holds for almost every $x$.

An immediate corollary is the uniqueness of the Fourier transform 
on L 1 . 


4 The $L^2$ theory will be dealt with in Chapter 5, and distributions will be studied in
Book IV.

Corollary 4.3 Suppose $\hat{f}(\xi) = 0$ for all $\xi$. Then $f = 0$ a.e.

The proof of the theorem requires only that we adapt the earlier arguments
carried out for Schwartz functions in Chapter 5 of Book I to the
present context. We begin with the “multiplication formula.”

Lemma 4.4 Suppose $f$ and $g$ belong to $L^1(\mathbb{R}^d)$. Then

$$ \int_{\mathbb{R}^d} \hat{f}(\xi)\, g(\xi)\,d\xi = \int_{\mathbb{R}^d} f(y)\, \hat{g}(y)\,dy. $$


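Note that both integrals converge in view of the proposition above. Consider
the function $F(\xi, y) = g(\xi)\, f(y)\, e^{-2\pi i \xi \cdot y}$ defined for $(\xi, y) \in \mathbb{R}^d \times
\mathbb{R}^d = \mathbb{R}^{2d}$. It is measurable as a function on $\mathbb{R}^{2d}$ in view of Corollary 3.7.
We now apply Fubini’s theorem to observe first that

$$ \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |F(\xi, y)|\,d\xi\,dy = \int_{\mathbb{R}^d} |g(\xi)|\,d\xi \int_{\mathbb{R}^d} |f(y)|\,dy < \infty. $$

Next, if we evaluate $\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} F(\xi, y)\,d\xi\,dy$ by writing it as
$\int_{\mathbb{R}^d} \left( \int_{\mathbb{R}^d} F(\xi, y)\,d\xi \right) dy$, the inner integral equals $f(y)\,\hat{g}(y)$ and we get
the right-hand side of the desired equality. Evaluating the double
integral in the reverse order gives the left-hand side, proving the
lemma.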

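Next we consider the modulated Gaussian, $g(\xi) = e^{-\pi \delta |\xi|^2} e^{2\pi i x \cdot \xi}$, where
for the moment $\delta$ and $x$ are fixed, with $\delta > 0$ and $x \in \mathbb{R}^d$. An elementary
calculation gives⁵

$$ \hat{g}(y) = \int_{\mathbb{R}^d} e^{-\pi \delta |\xi|^2} e^{2\pi i (x - y) \cdot \xi}\,d\xi = \delta^{-d/2} e^{-\pi |x - y|^2 / \delta}, $$

which we will abbreviate as $K_\delta(x - y)$. We recognize $K_\delta$ as a “good
kernel” that satisfies:

(i) $\int_{\mathbb{R}^d} K_\delta(y)\,dy = 1$.

(ii) For each $\eta > 0$, $\int_{|y| > \eta} K_\delta(y)\,dy \to 0$ as $\delta \to 0$.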

Applying the lemma gives

$$ \int_{\mathbb{R}^d} \hat{f}(\xi)\, e^{-\pi \delta |\xi|^2} e^{2\pi i x \cdot \xi}\,d\xi = \int_{\mathbb{R}^d} f(y)\, K_\delta(x - y)\,dy. \tag{16} $$


5 See for example Chapter 6 in Book I.

Note that since $\hat{f} \in L^1(\mathbb{R}^d)$, the dominated convergence theorem shows
that the left-hand side of (16) converges to $\int_{\mathbb{R}^d} \hat{f}(\xi)\, e^{2\pi i x \cdot \xi}\,d\xi$ as $\delta \to 0$, for
each $x$. As for the right-hand side, we make two successive changes of
variables $y \to y + x$ (a translation), and $y \to -y$ (a reflection), and take into
account the corresponding invariance of the integrals (see equations (4)
and (5)). Thus the right-hand side becomes $\int_{\mathbb{R}^d} f(x - y)\, K_\delta(y)\,dy$, and
we will prove that this function converges in the $L^1$-norm to $f$ as $\delta \to 0$.
Indeed, we can write the difference as

$$ \Delta_\delta(x) = \int_{\mathbb{R}^d} f(x - y)\, K_\delta(y)\,dy - f(x) = \int_{\mathbb{R}^d} \big( f(x - y) - f(x) \big) K_\delta(y)\,dy, $$

because of property (i) above. Thus

$$ |\Delta_\delta(x)| \le \int_{\mathbb{R}^d} |f(x - y) - f(x)|\, K_\delta(y)\,dy. $$

We can now apply Fubini’s theorem, recalling that the measurability
of $f(x)$ and $f(x - y)$ on $\mathbb{R}^d \times \mathbb{R}^d$ is established in Corollary 3.7 and
Proposition 3.9. The result is

$$ \|\Delta_\delta\| \le \int_{\mathbb{R}^d} \|f_y - f\|\, K_\delta(y)\,dy, \quad \text{where } f_y(x) = f(x - y). $$

Now, for given $\epsilon > 0$ we can find (by Proposition 2.5) $\eta > 0$ so small
that $\|f_y - f\| < \epsilon$ when $|y| < \eta$. Thus

$$ \|\Delta_\delta\| \le \epsilon + \int_{|y| > \eta} \|f_y - f\|\, K_\delta(y)\,dy \le \epsilon + 2\|f\| \int_{|y| > \eta} K_\delta(y)\,dy. $$

The first inequality follows by using (i) again; the second holds because
$\|f_y - f\| \le \|f_y\| + \|f\| = 2\|f\|$. Therefore, with the use of (ii), the
combination above is $< 2\epsilon$ if $\delta$ is sufficiently small. To summarize: the
right-hand side of (16) converges to $f$ in the $L^1$-norm as $\delta \to 0$, and thus
by Corollary 2.3 there is a subsequence that converges to $f(x)$ almost
everywhere, and the theorem is proved.

Note that an immediate consequence of the theorem and the proposition
is that if $\hat{f}$ were in $L^1$, then $f$ could be modified on a set of measure
zero to become continuous everywhere. This is of course impossible for
the general $f \in L^1(\mathbb{R}^d)$.
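
The two “good kernel” properties used in the proof can be checked numerically in dimension $d = 1$, where $K_\delta(y) = \delta^{-1/2} e^{-\pi y^2/\delta}$. The Python sketch below is an illustration under that assumption; the helper names, the truncation window and the grid size are choices made here. The total mass is approximated by a midpoint rule, and the tail mass is written with the complementary error function, using the substitution $t = y\sqrt{\pi/\delta}$, which gives $\int_{|y| > \eta} K_\delta(y)\,dy = \operatorname{erfc}\big(\eta \sqrt{\pi/\delta}\big)$.

```python
import math


def K(delta, y):
    """One-dimensional kernel K_delta(y) = delta**(-1/2) * exp(-pi * y**2 / delta)."""
    return math.exp(-math.pi * y * y / delta) / math.sqrt(delta)


def total_mass(delta, half_width=10.0, n=100000):
    """Midpoint-rule approximation of the integral of K_delta over
    [-half_width, half_width]; the mass outside this window is negligible
    for the values of delta used below."""
    h = 2.0 * half_width / n
    return sum(K(delta, -half_width + (k + 0.5) * h) for k in range(n)) * h


def tail_mass(delta, eta):
    """Mass of K_delta on {|y| > eta} in dimension one, via erfc."""
    return math.erfc(eta * math.sqrt(math.pi / delta))


for delta in (1.0, 0.1, 0.01):
    # Property (i): the total mass stays equal to 1.
    # Property (ii): the mass outside |y| <= 0.5 shrinks as delta decreases.
    print(delta, total_mass(delta), tail_mass(delta, 0.5))
```

As $\delta$ decreases, the kernel concentrates near the origin while keeping total mass 1, which is what allows the term $2\|f\| \int_{|y| > \eta} K_\delta(y)\,dy$ in the proof to be made small.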

5 Exercises 

1. Given a collection of sets $F_1, F_2, \ldots, F_n$, construct another collection
$F_1^*, F_2^*, \ldots, F_N^*$, with $N = 2^n - 1$, so that $\bigcup_{k=1}^n F_k = \bigcup_{j=1}^N F_j^*$; the
collection $\{F_j^*\}$ is disjoint; also $F_k = \bigcup_{F_j^* \subset F_k} F_j^*$, for every $k$.

[Hint: Consider the $2^n$ sets $F_1' \cap F_2' \cap \cdots \cap F_n'$ where each $F_k'$ is either $F_k$ or $F_k^c$.]

2. In analogy to Proposition 2.5, prove that if $f$ is integrable on $\mathbb{R}^d$ and $\delta > 0$,
then $f(\delta x)$ converges to $f(x)$ in the $L^1$-norm as $\delta \to 1$.

3. Suppose $f$ is integrable on $(-\pi, \pi]$ and extended to $\mathbb{R}$ by making it periodic of
period $2\pi$. Show that

$$ \int_{-\pi}^{\pi} f(x)\,dx = \int_I f(x)\,dx, $$

where $I$ is any interval in $\mathbb{R}$ of length $2\pi$.

[Hint: $I$ is contained in two consecutive intervals of the form $(k\pi, (k+2)\pi)$.]


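4. Suppose $f$ is integrable on $[0, b]$, and

$$ g(x) = \int_x^b \frac{f(t)}{t}\,dt \quad \text{for } 0 < x \le b. $$

Prove that $g$ is integrable on $[0, b]$ and

$$ \int_0^b g(x)\,dx = \int_0^b f(t)\,dt. $$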




5. Suppose $F$ is a closed set in $\mathbb{R}$, whose complement has finite measure, and let
$\delta(x)$ denote the distance from $x$ to $F$, that is,

$$ \delta(x) = d(x, F) = \inf\{|x - y| : y \in F\}. $$

Consider

$$ I(x) = \int_{\mathbb{R}} \frac{\delta(y)}{|x - y|^2}\,dy. $$

(a) Prove that $\delta$ is continuous, by showing that it satisfies the Lipschitz
condition

$$ |\delta(x) - \delta(y)| \le |x - y|. $$

(b) Show that $I(x) = \infty$ for each $x \notin F$.

(c) Show that $I(x) < \infty$ for a.e. $x \in F$. This may be surprising in view of the
fact that the Lipschitz condition cancels only one power of $|x - y|$ in the
integrand of $I$.

[Hint: For the last part, investigate $\int_F I(x)\,dx$.]

6. Integrability of $f$ on $\mathbb{R}$ does not necessarily imply the convergence of $f(x)$ to 0
as $x \to \infty$.

(a) There exists a positive continuous function $f$ on $\mathbb{R}$ so that $f$ is integrable
on $\mathbb{R}$, but yet $\limsup_{x \to \infty} f(x) = \infty$.

(b) However, if we assume that $f$ is uniformly continuous on $\mathbb{R}$ and integrable,
then $\lim_{|x| \to \infty} f(x) = 0$.

[Hint: For (a), construct a continuous version of the function equal to $n$ on the
segment $[n, n + 1/n^3)$, $n \ge 1$.]

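7. Let $\Gamma \subset \mathbb{R}^d \times \mathbb{R}$, $\Gamma = \{(x,y) \in \mathbb{R}^d \times \mathbb{R} : y = f(x)\}$, and assume $f$ is measurable
on $\mathbb{R}^d$. Show that $\Gamma$ is a measurable subset of $\mathbb{R}^{d+1}$, and $m(\Gamma) = 0$.

8. If $f$ is integrable on $\mathbb{R}$, show that $F(x) = \int_{-\infty}^x f(t)\,dt$ is uniformly continuous.

9. Tchebychev inequality. Suppose $f \ge 0$, and $f$ is integrable. If $\alpha > 0$ and
$E_\alpha = \{x : f(x) > \alpha\}$, prove that

$$ m(E_\alpha) \le \frac{1}{\alpha} \int f. $$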


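10. Suppose $f \ge 0$, and let $E_{2^k} = \{x : f(x) > 2^k\}$ and $F_k = \{x : 2^k < f(x) \le
2^{k+1}\}$. If $f$ is finite almost everywhere, then

$$ \bigcup_{k=-\infty}^{\infty} F_k = \{f(x) > 0\}, $$

and the sets $F_k$ are disjoint.

Prove that $f$ is integrable if and only if

$$ \sum_{k=-\infty}^{\infty} 2^k m(F_k) < \infty, \quad \text{if and only if} \quad \sum_{k=-\infty}^{\infty} 2^k m(E_{2^k}) < \infty. $$

Use this result to verify the following assertions. Let

$$ f(x) = \begin{cases} |x|^{-a} & \text{if } |x| \le 1, \\ 0 & \text{otherwise,} \end{cases} \qquad g(x) = \begin{cases} |x|^{-b} & \text{if } |x| > 1, \\ 0 & \text{otherwise.} \end{cases} $$

Then $f$ is integrable on $\mathbb{R}^d$ if and only if $a < d$; also $g$ is integrable on $\mathbb{R}^d$ if and
only if $b > d$.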


11. Prove that if $f$ is integrable on $\mathbb{R}^d$, real-valued, and $\int_E f(x)\,dx \ge 0$ for every
measurable $E$, then $f(x) \ge 0$ a.e. $x$. As a result, if $\int_E f(x)\,dx = 0$ for every
measurable $E$, then $f(x) = 0$ a.e.

12. Show that there are $f \in L^1(\mathbb{R}^d)$ and a sequence $\{f_n\}$ with $f_n \in L^1(\mathbb{R}^d)$ such
that

$$ \|f - f_n\|_{L^1} \to 0, $$

but $f_n(x) \to f(x)$ for no $x$.

[Hint: In $\mathbb{R}$, let $f_n = \chi_{I_n}$, where $I_n$ is an appropriately chosen sequence of intervals
with $m(I_n) \to 0$.]

13. Give an example of two measurable sets $A$ and $B$ such that $A + B$ is not
measurable.

[Hint: In $\mathbb{R}^2$ take $A = \{0\} \times [0,1]$ and $B = \mathcal{N} \times \{0\}$.]


14. In Exercise 6 of the previous chapter we saw that $m(B) = v_d r^d$, whenever $B$
is a ball of radius $r$ in $\mathbb{R}^d$ and $v_d = m(B_1)$, with $B_1$ the unit ball. Here we evaluate
the constant $v_d$.

(a) For $d = 2$, prove using Corollary 3.8 that

$$ v_2 = 2 \int_{-1}^{1} (1 - x^2)^{1/2}\,dx, $$

and hence by elementary calculus, that $v_2 = \pi$.

(b) By similar methods, show that

$$ v_d = 2 v_{d-1} \int_0^1 (1 - x^2)^{(d-1)/2}\,dx. $$

(c) The result is

$$ v_d = \frac{\pi^{d/2}}{\Gamma(d/2 + 1)}. $$

Another derivation is in Exercise 5 in Chapter 6 below. Relevant facts about the
gamma and beta functions can be found in Chapter 6 of Book II.


15. Consider the function defined over $\mathbb{R}$ by

$$ f(x) = \begin{cases} x^{-1/2} & \text{if } 0 < x < 1, \\ 0 & \text{otherwise.} \end{cases} $$

For a fixed enumeration $\{r_n\}_{n=1}^\infty$ of the rationals $\mathbb{Q}$, let

$$ F(x) = \sum_{n=1}^\infty 2^{-n} f(x - r_n). $$

Prove that $F$ is integrable, hence the series defining $F$ converges for almost every
$x \in \mathbb{R}$. However, observe that this series is unbounded on every interval, and in
fact, any function $\tilde{F}$ that agrees with $F$ a.e. is unbounded in any interval.

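16. Suppose $f$ is integrable on $\mathbb{R}^d$. If $\delta = (\delta_1, \ldots, \delta_d)$ is a $d$-tuple of non-zero real
numbers, and

$$ f^\delta(x) = f(\delta x) = f(\delta_1 x_1, \ldots, \delta_d x_d), $$

show that $f^\delta$ is integrable with

$$ \int_{\mathbb{R}^d} f^\delta(x)\,dx = |\delta_1|^{-1} \cdots |\delta_d|^{-1} \int_{\mathbb{R}^d} f(x)\,dx. $$

17. Suppose $f$ is defined on $\mathbb{R}^2$ as follows: $f(x,y) = a_n$ if $n \le x < n + 1$ and $n \le
y < n + 1$, $(n \ge 0)$; $f(x,y) = -a_n$ if $n \le x < n + 1$ and $n + 1 \le y < n + 2$, $(n \ge 0)$;
while $f(x,y) = 0$ elsewhere. Here $a_n = \sum_{k \le n} b_k$, with $\{b_k\}$ a positive sequence
such that $\sum_{k=0}^\infty b_k = s < \infty$.

(a) Verify that each slice $f^y$ and $f_x$ is integrable. Also for all $x$, $\int f_x(y)\,dy = 0$,
and hence $\int \left( \int f(x,y)\,dy \right) dx = 0$.

(b) However, $\int f^y(x)\,dx = a_0$ if $0 \le y < 1$, and $\int f^y(x)\,dx = a_n - a_{n-1}$ if $n \le
y < n + 1$ with $n \ge 1$. Hence $y \mapsto \int f^y(x)\,dx$ is integrable on $(0, \infty)$ and

$$ \int \left( \int f(x,y)\,dx \right) dy = s. $$

(c) Note that $\int_{\mathbb{R} \times \mathbb{R}} |f(x,y)|\,dx\,dy = \infty$.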


18. Let $f$ be a measurable finite-valued function on $[0,1]$, and suppose that $|f(x) -
f(y)|$ is integrable on $[0,1] \times [0,1]$. Show that $f(x)$ is integrable on $[0,1]$.

19. Suppose $f$ is integrable on $\mathbb{R}^d$. For each $\alpha > 0$, let $E_\alpha = \{x : |f(x)| > \alpha\}$.
Prove that

$$ \int_{\mathbb{R}^d} |f(x)|\,dx = \int_0^\infty m(E_\alpha)\,d\alpha. $$



20. The problem (highlighted in the discussion preceding Fubini’s theorem) that
certain slices of measurable sets can be non-measurable may be avoided by
restricting attention to Borel measurable functions and Borel sets. In fact, prove the
following:

Suppose $E$ is a Borel set in $\mathbb{R}^2$. Then for every $y$, the slice $E^y$ is a Borel set in
$\mathbb{R}$.

[Hint: Consider the collection $\mathcal{C}$ of subsets $E$ of $\mathbb{R}^2$ with the property that each
slice $E^y$ is a Borel set in $\mathbb{R}$. Verify that $\mathcal{C}$ is a $\sigma$-algebra that contains the open
sets.]

21. Suppose that $f$ and $g$ are measurable functions on $\mathbb{R}^d$.

(a) Prove that $f(x - y)\,g(y)$ is measurable on $\mathbb{R}^{2d}$.

(b) Show that if $f$ and $g$ are integrable on $\mathbb{R}^d$, then $f(x - y)\,g(y)$ is integrable
on $\mathbb{R}^{2d}$.

(c) Recall the definition of the convolution of $f$ and $g$ given by

$$ (f * g)(x) = \int_{\mathbb{R}^d} f(x - y)\,g(y)\,dy. $$

Show that $f * g$ is well defined for a.e. $x$ (that is, $f(x - y)\,g(y)$ is integrable
on $\mathbb{R}^d$ for a.e. $x$).

(d) Show that $f * g$ is integrable whenever $f$ and $g$ are integrable, and that

$$ \|f * g\|_{L^1(\mathbb{R}^d)} \le \|f\|_{L^1(\mathbb{R}^d)}\, \|g\|_{L^1(\mathbb{R}^d)}, $$

with equality if $f$ and $g$ are non-negative.

(e) The Fourier transform of an integrable function $f$ is defined by

$$ \hat{f}(\xi) = \int_{\mathbb{R}^d} f(x)\, e^{-2\pi i x \cdot \xi}\,dx. $$

Check that $\hat{f}$ is bounded and is a continuous function of $\xi$. Prove that for
each $\xi$ one has

$$ \widehat{(f * g)}(\xi) = \hat{f}(\xi)\, \hat{g}(\xi). $$

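22. Prove that if $f \in L^1(\mathbb{R}^d)$ and

$$ \hat{f}(\xi) = \int_{\mathbb{R}^d} f(x)\, e^{-2\pi i x \cdot \xi}\,dx, $$

then $\hat{f}(\xi) \to 0$ as $|\xi| \to \infty$. (This is the Riemann-Lebesgue lemma.)

[Hint: Write $\hat{f}(\xi) = \frac{1}{2} \int_{\mathbb{R}^d} [f(x) - f(x - \xi')]\, e^{-2\pi i x \cdot \xi}\,dx$, where $\xi' = \frac{1}{2} \frac{\xi}{|\xi|^2}$, and use
Proposition 2.5.]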

23. As an application of the Fourier transform, show that there does not exist a
function $I \in L^1(\mathbb{R}^d)$ such that

$$ f * I = f \quad \text{for all } f \in L^1(\mathbb{R}^d). $$

24. Consider the convolution

$$ (f * g)(x) = \int_{\mathbb{R}^d} f(x - y)\,g(y)\,dy. $$

(a) Show that $f * g$ is uniformly continuous when $f$ is integrable and $g$ bounded.

(b) If in addition $g$ is integrable, prove that $(f * g)(x) \to 0$ as $|x| \to \infty$.

25. Show that for each $\epsilon > 0$ the function $F(\xi) = \frac{1}{(1 + |\xi|^2)^\epsilon}$ is the Fourier transform
of an $L^1$ function.

[Hint: With $K_\delta(x) = e^{-\pi |x|^2 / \delta}\, \delta^{-d/2}$ consider $f(x) = \int_0^\infty K_\delta(x)\, e^{-\pi \delta}\, \delta^{\epsilon - 1}\,d\delta$. Use
Fubini’s theorem to prove $f \in L^1(\mathbb{R}^d)$, and

$$ \hat{f}(\xi) = \int_0^\infty e^{-\pi \delta |\xi|^2}\, e^{-\pi \delta}\, \delta^{\epsilon - 1}\,d\delta, $$

and evaluate the last integral as $\pi^{-\epsilon}\, \Gamma(\epsilon)\, \frac{1}{(1 + |\xi|^2)^\epsilon}$. Here $\Gamma(s)$ is the gamma function
defined by $\Gamma(s) = \int_0^\infty e^{-t} t^{s-1}\,dt$.]

6 Problems 

1. If $f$ is integrable on $[0, 2\pi]$, then $\int_0^{2\pi} f(x)\, e^{-inx}\,dx \to 0$ as $|n| \to \infty$.

Show as a consequence that if $E$ is a measurable subset of $[0, 2\pi]$, then

$$ \int_E \cos^2(nx + u_n)\,dx \to \frac{m(E)}{2}, \quad \text{as } n \to \infty, $$

for any sequence $\{u_n\}$.

[Hint: See Exercise 22.]

2. Prove the Cantor-Lebesgue theorem: if

$$ \sum_{n=0}^\infty A_n(x) = \sum_{n=0}^\infty (a_n \cos nx + b_n \sin nx) $$

converges for $x$ in a set of positive measure (or in particular for all $x$), then $a_n \to 0$
and $b_n \to 0$ as $n \to \infty$.

[Hint: Note that $A_n(x) \to 0$ uniformly on a set $E$ of positive measure.]

3. A sequence $\{f_k\}$ of measurable functions on $\mathbb{R}^d$ is Cauchy in measure if for
every $\epsilon > 0$,

$$ m(\{x : |f_k(x) - f_\ell(x)| > \epsilon\}) \to 0 \quad \text{as } k, \ell \to \infty. $$

We say that $\{f_k\}$ converges in measure to a (measurable) function $f$ if for every
$\epsilon > 0$

$$ m(\{x : |f_k(x) - f(x)| > \epsilon\}) \to 0 \quad \text{as } k \to \infty. $$

This notion coincides with the “convergence in probability” of probability theory. 

Prove that if a sequence {fk} of integrable functions converges to / in L 1 , then 
{fk} converges to / in measure. Is the converse true? 

We remark that this mode of convergence appears naturally in the proof of 
Egorov’s theorem. 


4. We have already seen (in Exercise 8, Chapter 1) that if $E$ is a measurable set
in $\mathbb{R}^d$, and $L$ is a linear transformation of $\mathbb{R}^d$ to $\mathbb{R}^d$, then $L(E)$ is also measurable,
and if $E$ has measure 0, then so has $L(E)$. The quantitative statement is

$$ m(L(E)) = |\det(L)|\, m(E). $$

As a special case, note that the Lebesgue measure is invariant under rotations.
(For this special case see also Exercise 26 in the next chapter.)

The above identity can be proved using Fubini’s theorem as follows.

(a) Consider first the case $d = 2$, and $L$ a “strictly” upper triangular
transformation $x' = x + ay$, $y' = y$. Then

$$ \chi_{L(E)}(x,y) = \chi_E(L^{-1}(x,y)) = \chi_E(x - ay, y). $$

Hence

$$ m(L(E)) = \int_{\mathbb{R}} \left( \int_{\mathbb{R}} \chi_E(x - ay, y)\,dx \right) dy = \int_{\mathbb{R}} \left( \int_{\mathbb{R}} \chi_E(x, y)\,dx \right) dy = m(E), $$

by the translation-invariance of the measure.

(b) Similarly $m(L(E)) = m(E)$ if $L$ is strictly lower triangular. In general, one
can write $L = L_1 \Delta L_2$, where $L_j$ are strictly (upper and lower) triangular
and $\Delta$ is diagonal. Thus $m(L(E)) = |\det(L)|\, m(E)$, if one uses Exercise 7
in Chapter 1.


5. There is an ordering $\prec$ of $\mathbb{R}$ with the property that for each $y \in \mathbb{R}$ the set
$\{x \in \mathbb{R} : x \prec y\}$ is at most countable.

The existence of this ordering depends on the continuum hypothesis, which
asserts: whenever $S$ is an infinite subset of $\mathbb{R}$, then either $S$ is countable, or $S$ has
the cardinality of $\mathbb{R}$ (that is, can be mapped bijectively to $\mathbb{R}$).⁶


6 This assertion, formulated by Cantor, is like the well-ordering principle independent 
of the other axioms of set theory, and so we are also free to accept its validity. 





[Hint: Let $\prec$ denote a well-ordering of $\mathbb{R}$, and define the set $X$ by $X = \{y \in
\mathbb{R}$ : the set $\{x : x \prec y\}$ is not countable$\}$. If $X$ is empty we are done. Otherwise,
consider the smallest element $\bar{y}$ in $X$, and use the continuum hypothesis.]



Differentiation and Integration 


The Maximal Problem: 


The problem is most easily grasped when stated 
in the language of cricket, or any other game in which 
a player compiles a series of scores of which an average 
is recorded. 


G. H. Hardy and J. E. Littlewood, 1930 


That differentiation and integration are inverse operations was already 
understood early in the study of the calculus. Here we want to reexamine 
this basic idea in the framework of the general theory studied in the 
previous chapters. Our objective is the formulation and proof of the 
fundamental theorem of the calculus in this setting, and the development 
of some of the concepts that occur. We shall try to achieve this by 
answering two questions, each expressing one of the ways of representing 
the reciprocity between differentiation and integration. 

The first problem involved may be stated as follows. 

• Suppose $f$ is integrable on $[a, b]$ and $F$ is its indefinite integral
$F(x) = \int_a^x f(y)\,dy$. Does this imply that $F$ is differentiable (at
least for almost every $x$), and that $F' = f$?

We shall see that the affirmative answer to this question depends
on ideas that have broad application and are not limited to the
one-dimensional situation.

For the second question we reverse the order of differentiation and 
integration. 

• What conditions on a function $F$ on $[a, b]$ guarantee that $F'(x)$ exists
(for a.e. $x$), that this function is integrable, and that moreover

\[ F(b) - F(a) = \int_a^b F'(x)\,dx\,? \]

While this problem will be examined from a narrower perspective than
the first, the issues it raises are deep and the consequences entailed are
far-reaching. In particular, we shall find that this question is connected
to the problem of rectifiability of curves, and as an illustration of this
link, we shall establish the general isoperimetric inequality in the plane.

1 Differentiation of the integral 

We begin with the first problem, that is, the study of differentiation of
the integral. If $f$ is given on $[a, b]$ and integrable on that interval, we let

\[ F(x) = \int_a^x f(y)\,dy, \quad a \le x \le b. \]

To deal with $F'(x)$, we recall the definition of the derivative as the limit
of the quotient

\[ \frac{F(x + h) - F(x)}{h} \]

when $h$ tends to 0. We note that this quotient takes the form (say in the case $h > 0$)

\[ \frac{1}{h} \int_x^{x+h} f(y)\,dy = \frac{1}{|I|} \int_I f(y)\,dy, \]

where we use the notation $I = (x, x + h)$ and $|I|$ for the length of this
interval. At this point, we pause to observe that the above expression
is the "average" value of $f$ over $I$, and that in the limit as $|I| \to 0$,
we might expect that these averages tend to $f(x)$. Reformulating the
question slightly, we may ask whether

\[ \lim_{\substack{|I| \to 0 \\ x \in I}} \frac{1}{|I|} \int_I f(y)\,dy = f(x) \]

holds for suitable points $x$. In higher dimensions we can pose a similar
question, where the averages of $f$ are taken over appropriate sets that
generalize the intervals in one dimension. Initially we shall study this
problem where the sets involved are the balls $B$ containing $x$, with their
volume $m(B)$ replacing the length $|I|$ of $I$. Later we shall see that as a
consequence of this special case similar results will hold for more general
collections of sets, those that have bounded "eccentricity."

With this in mind we restate our first problem in the context of $\mathbb{R}^d$,
for all $d \ge 1$.

Suppose $f$ is integrable on $\mathbb{R}^d$. Is it true that

\[ \lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B f(y)\,dy = f(x), \quad \text{for a.e. } x? \]

The limit is taken as the volume of open balls $B$ containing
$x$ goes to 0.

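We shall refer to this question as the averaging problem. We remark
that if $B$ is any ball of radius $r$ in $\mathbb{R}^d$, then $m(B) = v_d r^d$, where $v_d$ is
the measure of the unit ball. (See Exercise 14 in the previous chapter.)

Note of course that in the special case when $f$ is continuous at $x$, the
limit does converge to $f(x)$. Indeed, given $\epsilon > 0$, there exists $\delta > 0$ such
that $|f(x) - f(y)| < \epsilon$ whenever $|x - y| < \delta$. Since

\[ f(x) - \frac{1}{m(B)} \int_B f(y)\,dy = \frac{1}{m(B)} \int_B (f(x) - f(y))\,dy, \]

we find that whenever $B$ is a ball of radius $< \delta/2$ that contains $x$, then

\[ \left| f(x) - \frac{1}{m(B)} \int_B f(y)\,dy \right| \le \frac{1}{m(B)} \int_B |f(x) - f(y)|\,dy < \epsilon, \]

as desired.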

The averaging problem has an affirmative answer, but to establish that
fact, which is qualitative in nature, we need to make some quantitative
estimates bearing on the overall behavior of the averages of $f$. This will
be done in terms of the maximal averages of $|f|$, to which we now turn.

1.1 The Hardy-Littlewood maximal function 

The maximal function that we consider below arose first in the
one-dimensional situation treated by Hardy and Littlewood. It seems that
they were led to the study of this function by toying with the question
of how a batsman's score in cricket may best be distributed to maximize
his satisfaction. As it turns out, the concepts involved have a universal
significance in analysis. The relevant definition is as follows.

If $f$ is integrable on $\mathbb{R}^d$, we define its maximal function $f^*$ by

\[ f^*(x) = \sup_{x \in B} \frac{1}{m(B)} \int_B |f(y)|\,dy, \quad x \in \mathbb{R}^d, \]

where the supremum is taken over all balls containing the point $x$. In
other words, we replace the limit in the statement of the averaging
problem by a supremum, and $f$ by its absolute value.

The main properties of $f^*$ we shall need are summarized in a theorem.

Theorem 1.1 Suppose $f$ is integrable on $\mathbb{R}^d$. Then:

(i) $f^*$ is measurable.

(ii) $f^*(x) < \infty$ for a.e. $x$.

(iii) $f^*$ satisfies

\[ m(\{x \in \mathbb{R}^d : f^*(x) > \alpha\}) \le \frac{A}{\alpha}\, \|f\|_{L^1(\mathbb{R}^d)} \tag{1} \]

for all $\alpha > 0$, where $A = 3^d$, and $\|f\|_{L^1(\mathbb{R}^d)} = \int_{\mathbb{R}^d} |f(x)|\,dx$.

Before we come to the proof we want to clarify the nature of the main
conclusion (iii). As we shall observe, one has that $f^*(x) \ge |f(x)|$ for a.e.
$x$; the effect of (iii) is that, broadly speaking, $f^*$ is not much larger than
$|f|$. From this point of view, we would have liked to conclude that $f^*$ is
integrable, as a result of the assumed integrability of $f$. However, this
is not the case, and (iii) is the best substitute available (see Exercises 4
and 5).
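As a concrete illustration of why $f^*$ can fail to be integrable, take $d = 1$ and
$f = \chi_{[0,1]}$. A direct computation (not carried out in the text) gives
$f^*(x) = 1/x$ for $x > 1$ and $f^*(x) = 1/(1 - x)$ for $x < 0$, so $f^*$ decays only like
$1/|x|$. The sketch below approximates the supremum by brute force over intervals
whose endpoints lie on a grid; the name `maximal_indicator` and the grid parameters
are illustrative choices, and because only some intervals are tested the result is
a lower bound for $f^*(x)$.

```python
def maximal_indicator(x, step=0.02, reach=5.0):
    """Lower bound for f*(x), where f is the indicator function of [0, 1].

    Open intervals (a, b) containing x play the role of the balls B;
    the endpoints a < x < b are restricted to a grid of spacing `step`.
    """
    n = int(round(reach / step))
    best = 0.0
    for i in range(1, n + 1):
        a = x - i * step                      # left endpoint, a < x
        for j in range(1, n + 1):
            b = x + j * step                  # right endpoint, b > x
            overlap = max(0.0, min(b, 1.0) - max(a, 0.0))   # m((a, b) ∩ [0, 1])
            best = max(best, overlap / (b - a))             # average of |f| over (a, b)
    return best


# Compare the grid approximation with the closed form stated above.
for x in (0.5, 2.0, 3.0, -1.0):
    exact = 1.0 if 0 < x < 1 else (1 / x if x > 1 else 1 / (1 - x))
    print(x, round(maximal_indicator(x), 3), round(exact, 3))
```

Since $1/|x|$ is not integrable at infinity, $f^*$ is not in $L^1(\mathbb{R})$ in this example.
The weak-type bound (1) nevertheless holds: for $0 < \alpha < 1$ the set $\{f^* > \alpha\}$ is the
interval $(1 - 1/\alpha,\, 1/\alpha)$, whose measure $2/\alpha - 1$ is at most $3/\alpha$.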

An inequality of the type (1) is called a weak-type inequality because
it is weaker than the corresponding inequality for the $L^1$-norms.
Indeed, this can be seen from the Tchebychev inequality (Exercise 9 in
Chapter 2), which states that for an arbitrary integrable function $g$,

\[ m(\{x : |g(x)| > \alpha\}) \le \frac{1}{\alpha}\, \|g\|_{L^1(\mathbb{R}^d)}, \quad \text{for all } \alpha > 0. \]

We should add that the exact value of $A$ in the inequality (1) is
unimportant for us. What matters is that this constant be independent of $\alpha$
and $f$.

The only simple assertion in the theorem is that $f^*$ is a measurable
function. Indeed, the set $E_\alpha = \{x \in \mathbb{R}^d : f^*(x) > \alpha\}$ is open, because if
$\bar{x} \in E_\alpha$, there exists a ball $B$ such that $\bar{x} \in B$ and

\[ \frac{1}{m(B)} \int_B |f(y)|\,dy > \alpha. \]

Now any point $x$ close enough to $\bar{x}$ will also belong to $B$; hence $x \in E_\alpha$
as well.

The two other properties of $f^*$ in the theorem are deeper, with (ii)
being a consequence of (iii). This follows at once if we observe that

\[ \{x : f^*(x) = \infty\} \subset \{x : f^*(x) > \alpha\} \]

for all $\alpha$. Taking the limit as $\alpha$ tends to infinity, the third property yields
$m(\{x : f^*(x) = \infty\}) = 0$.

The proof of inequality (1) relies on an elementary version of a Vitali
covering argument.^1

Lemma 1.2 Suppose $\mathcal{B} = \{B_1, B_2, \ldots, B_N\}$ is a finite collection of open
balls in $\mathbb{R}^d$. Then there exists a disjoint sub-collection $B_{i_1}, B_{i_2}, \ldots, B_{i_k}$
of $\mathcal{B}$ that satisfies

\[ m\left( \bigcup_{\ell=1}^N B_\ell \right) \le 3^d \sum_{j=1}^k m(B_{i_j}). \]

Loosely speaking, we may always find a disjoint sub-collection of balls
that covers a fraction of the region covered by the original collection of
balls.


Proof. The argument we give is constructive and relies on the
following simple observation: Suppose $B$ and $B'$ are a pair of balls that
intersect, with the radius of $B'$ being not greater than that of $B$. Then
$B'$ is contained in the ball $\tilde{B}$ that is concentric with $B$ but with 3 times
its radius.

Figure 1. The balls $B$ and $\tilde{B}$

As a first step, we pick a ball $B_{i_1}$ in $\mathcal{B}$ with maximal (that is, largest)
radius, and then delete from $\mathcal{B}$ the ball $B_{i_1}$ as well as any balls that
intersect $B_{i_1}$. Thus all the balls that are deleted are contained in the
ball $\tilde{B}_{i_1}$ concentric with $B_{i_1}$, but with 3 times its radius.

The remaining balls yield a new collection $\mathcal{B}'$, for which we repeat the
procedure. We pick $B_{i_2}$ with largest radius in $\mathcal{B}'$, and then delete from
$\mathcal{B}'$ the ball $B_{i_2}$ and any ball that intersects $B_{i_2}$. Continuing this way we
find, after at most $N$ steps, a collection of disjoint balls $B_{i_1}, B_{i_2}, \ldots, B_{i_k}$.

Finally, to prove that this disjoint collection of balls satisfies the
inequality in the lemma, we use the observation made at the beginning of
the proof. We let $\tilde{B}_{i_j}$ denote the ball concentric with $B_{i_j}$, but with 3
times its radius. Since any ball $B$ in $\mathcal{B}$ must intersect a ball $B_{i_j}$ and have
equal or smaller radius than $B_{i_j}$, we must have $B \subset \tilde{B}_{i_j}$, thus

\[ m\left( \bigcup_{\ell=1}^N B_\ell \right) \le m\left( \bigcup_{j=1}^k \tilde{B}_{i_j} \right)
 \le \sum_{j=1}^k m(\tilde{B}_{i_j}) = 3^d \sum_{j=1}^k m(B_{i_j}). \]

In the last step we have used the fact that in $\mathbb{R}^d$ a dilation of a set by
$\delta > 0$ results in the multiplication by $\delta^d$ of the Lebesgue measure of this
set.

^1 We note that the lemma that follows is the first of a series of covering arguments that
occur below in the theory of differentiation; see also Lemma 3.9 and its corollary, as well
as Lemma 3.5, where the covering assertion is more implicit.
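The greedy selection used in the proof of Lemma 1.2 can be sketched in a few lines.
This is only an illustration under stated assumptions: balls are represented as
hypothetical `(center, radius)` pairs, the test data are random discs in the plane,
and the function names are not taken from the text.

```python
import math
import random


def intersects(ball1, ball2):
    """Two open balls meet exactly when their centers are closer than the sum of the radii."""
    (c1, r1), (c2, r2) = ball1, ball2
    return math.dist(c1, c2) < r1 + r2


def disjoint_subcollection(balls):
    """Greedy selection from the proof: repeatedly keep a ball of largest radius
    and discard every remaining ball that intersects it.

    `balls` is a list of (center, radius) pairs; the return value is the list
    of indices i_1, ..., i_k of the selected, pairwise disjoint balls.
    """
    remaining = sorted(range(len(balls)), key=lambda i: balls[i][1], reverse=True)
    chosen = []
    while remaining:
        i = remaining[0]                      # largest radius among the remaining balls
        chosen.append(i)
        remaining = [j for j in remaining[1:] if not intersects(balls[i], balls[j])]
    return chosen


random.seed(0)
balls = [((random.uniform(0, 10), random.uniform(0, 10)), random.uniform(0.1, 1.5))
         for _ in range(200)]
chosen = disjoint_subcollection(balls)

# Every original ball lies inside the 3-fold dilate of some selected ball.
covered = all(
    any(math.dist(c, balls[i][0]) + r <= 3 * balls[i][1] + 1e-12 for i in chosen)
    for c, r in balls
)
print(len(chosen), covered)
```

The containment check at the end is the "simple observation" of the proof: a discarded
ball meets a selected ball of at least its own radius, so it sits inside the dilate
with 3 times that radius.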


The proof of (iii) in Theorem 1.1 is now in reach. If we let $E_\alpha = \{x :
f^*(x) > \alpha\}$, then for each $x \in E_\alpha$ there exists a ball $B_x$ that contains $x$,
and such that

\[ \frac{1}{m(B_x)} \int_{B_x} |f(y)|\,dy > \alpha. \]

Therefore, for each ball $B_x$ we have

\[ m(B_x) < \frac{1}{\alpha} \int_{B_x} |f(y)|\,dy. \tag{2} \]

Fix a compact subset $K$ of $E_\alpha$. Since $K$ is covered by $\bigcup_{x \in E_\alpha} B_x$, we
may select a finite subcover of $K$, say $K \subset \bigcup_{\ell=1}^N B_\ell$. The covering lemma
guarantees the existence of a sub-collection $B_{i_1}, \ldots, B_{i_k}$ of disjoint balls
with

\[ m\left( \bigcup_{\ell=1}^N B_\ell \right) \le 3^d \sum_{j=1}^k m(B_{i_j}). \tag{3} \]

Since the balls $B_{i_1}, \ldots, B_{i_k}$ are disjoint and satisfy (2) as well as (3), we
find that

\[ m(K) \le m\left( \bigcup_{\ell=1}^N B_\ell \right) \le 3^d \sum_{j=1}^k m(B_{i_j})
 \le \frac{3^d}{\alpha} \sum_{j=1}^k \int_{B_{i_j}} |f(y)|\,dy
 = \frac{3^d}{\alpha} \int_{\bigcup_{j=1}^k B_{i_j}} |f(y)|\,dy
 \le \frac{3^d}{\alpha} \int_{\mathbb{R}^d} |f(y)|\,dy. \]

Since this inequality is true for all compact subsets $K$ of $E_\alpha$, the proof
of the weak type inequality for the maximal operator is complete.


1.2 The Lebesgue differentiation theorem 

The estimate obtained for the maximal function now leads to a solution 
of the averaging problem. 

Theorem 1.3 If $f$ is integrable on $\mathbb{R}^d$, then

\[ \lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B f(y)\,dy = f(x) \quad \text{for a.e. } x. \tag{4} \]

Proof. It suffices to show that for each $\alpha > 0$ the set

\[ E_\alpha = \left\{ x : \limsup_{\substack{m(B) \to 0 \\ x \in B}}
 \left| \frac{1}{m(B)} \int_B f(y)\,dy - f(x) \right| > 2\alpha \right\} \]

has measure zero, because this assertion then guarantees that the set
$E = \bigcup_{n=1}^\infty E_{1/n}$ has measure zero, and the limit in (4) holds at all points
of $E^c$.

We fix $\alpha$, and recall Theorem 2.4 in Chapter 2, which states that for
each $\epsilon > 0$ we may select a continuous function $g$ of compact support with
$\|f - g\|_{L^1(\mathbb{R}^d)} < \epsilon$. As we remarked earlier, the continuity of $g$ implies
that

\[ \lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B g(y)\,dy = g(x), \quad \text{for all } x. \]

Since we may write the difference $\frac{1}{m(B)} \int_B f(y)\,dy - f(x)$ as

\[ \frac{1}{m(B)} \int_B (f(y) - g(y))\,dy + \frac{1}{m(B)} \int_B g(y)\,dy - g(x) + g(x) - f(x), \]

we find that

\[ \limsup_{\substack{m(B) \to 0 \\ x \in B}} \left| \frac{1}{m(B)} \int_B f(y)\,dy - f(x) \right|
 \le (f - g)^*(x) + |g(x) - f(x)|, \]

where the symbol $*$ indicates the maximal function. Consequently, if

\[ F_\alpha = \{x : (f - g)^*(x) > \alpha\} \quad \text{and} \quad G_\alpha = \{x : |f(x) - g(x)| > \alpha\}, \]

then $E_\alpha \subset (F_\alpha \cup G_\alpha)$, because if $u_1$ and $u_2$ are positive, then $u_1 + u_2 >
2\alpha$ only if $u_i > \alpha$ for at least one $u_i$. On the one hand, Tchebychev's
inequality yields

\[ m(G_\alpha) \le \frac{1}{\alpha}\, \|f - g\|_{L^1(\mathbb{R}^d)}, \]

and on the other hand, the weak type estimate for the maximal function
gives

\[ m(F_\alpha) \le \frac{A}{\alpha}\, \|f - g\|_{L^1(\mathbb{R}^d)}. \]

The function $g$ was selected so that $\|f - g\|_{L^1(\mathbb{R}^d)} < \epsilon$. Hence we get

\[ m(E_\alpha) \le \frac{A}{\alpha}\, \epsilon + \frac{1}{\alpha}\, \epsilon. \]

Since $\epsilon$ is arbitrary, we must have $m(E_\alpha) = 0$, and the proof of the
theorem is complete.

Note that as an immediate consequence of the theorem applied to $|f|$,
we see that $f^*(x) \ge |f(x)|$ for a.e. $x$, with $f^*$ the maximal function.

We have worked so far under the assumption that $f$ is integrable. This
"global" assumption is slightly out of place in the context of a "local"
notion like differentiability. Indeed, the limit in Lebesgue's theorem is
taken over balls that shrink to the point $x$, so the behavior of $f$ far from
$x$ is irrelevant. Thus, we expect the result to remain valid if we simply
assume integrability of $f$ on every ball.

To make this precise, we say that a measurable function $f$ on $\mathbb{R}^d$
is locally integrable, if for every ball $B$ the function $f(x)\chi_B(x)$ is
integrable. We shall denote by $L^1_{\mathrm{loc}}(\mathbb{R}^d)$ the space of all locally integrable
functions. Loosely speaking, the behavior at infinity does not affect the
local integrability of a function. For example, the functions $e^{|x|}$ and
$|x|^{-1/2}$ are both locally integrable, but not integrable on $\mathbb{R}^d$.

Clearly, the conclusion of the last theorem holds under the weaker
assumption that $f$ is locally integrable.

Theorem 1.4 If $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, then

\[ \lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B f(y)\,dy = f(x), \quad \text{for a.e. } x. \]
Our first application of this theorem yields an interesting insight into
the nature of measurable sets. If $E$ is a measurable set and $x \in \mathbb{R}^d$, we
say that $x$ is a point of Lebesgue density of $E$ if

\[ \lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{m(B \cap E)}{m(B)} = 1. \]

Loosely speaking, this condition says that small balls around $x$ are almost
entirely covered by $E$. More precisely, for every $\alpha < 1$ close to 1, and
every ball of sufficiently small radius containing $x$, we have

\[ m(B \cap E) \ge \alpha\, m(B). \]

Thus $E$ covers at least a proportion $\alpha$ of $B$.

An application of Theorem 1.4 to the characteristic function of $E$
immediately yields the following:

Corollary 1.5 Suppose $E$ is a measurable subset of $\mathbb{R}^d$. Then:

(i) Almost every $x \in E$ is a point of density of $E$.

(ii) Almost every $x \notin E$ is not a point of density of $E$.
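A small numerical sketch may help fix the definition of a point of density. It takes
$E = [0, 1] \subset \mathbb{R}$ and, as a simplifying assumption, tests only the symmetric
intervals $B = (x - r, x + r)$ rather than all balls containing $x$; the function name
`density_ratio` is illustrative.

```python
def density_ratio(x, r, left=0.0, right=1.0):
    """m(B ∩ E) / m(B) for E = [left, right] and the interval B = (x - r, x + r)."""
    overlap = max(0.0, min(x + r, right) - max(x - r, left))
    return overlap / (2 * r)


# Interior point, endpoint of E, and a point outside E, for shrinking radii.
for x in (0.5, 0.0, -0.2):
    print(x, [round(density_ratio(x, 10.0 ** -k), 3) for k in range(1, 5)])
```

At the interior point the ratio is eventually 1, and outside $E$ it is eventually 0.
At the endpoint $x = 0$ the ratio stays at $1/2$, so $0$ belongs to $E$ without being a
point of density. This is consistent with the corollary, which only asserts the
property for almost every point.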

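We next consider a notion that for integrable functions serves as a useful
substitute for pointwise continuity.

If $f$ is locally integrable on $\mathbb{R}^d$, the Lebesgue set of $f$ consists of all
points $\bar{x} \in \mathbb{R}^d$ for which $f(\bar{x})$ is finite and

\[ \lim_{\substack{m(B) \to 0 \\ \bar{x} \in B}} \frac{1}{m(B)} \int_B |f(y) - f(\bar{x})|\,dy = 0. \]

At this stage, two simple observations about this definition are in order.
First, $\bar{x}$ belongs to the Lebesgue set of $f$ whenever $f$ is continuous at $\bar{x}$.
Second, if $\bar{x}$ is in the Lebesgue set of $f$, then

\[ \lim_{\substack{m(B) \to 0 \\ \bar{x} \in B}} \frac{1}{m(B)} \int_B f(y)\,dy = f(\bar{x}). \]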

Corollary 1.6 If $f$ is locally integrable on $\mathbb{R}^d$, then almost every point
belongs to the Lebesgue set of $f$.

Proof. An application of Theorem 1.4 to the function $|f(y) - r|$ shows
that for each rational $r$, there exists a set $E_r$ of measure zero, such that

\[ \lim_{\substack{m(B) \to 0 \\ x \in B}} \frac{1}{m(B)} \int_B |f(y) - r|\,dy = |f(x) - r| \quad \text{whenever } x \notin E_r. \]

If $E = \bigcup_{r \in \mathbb{Q}} E_r$, then $m(E) = 0$. Now suppose that $\bar{x} \notin E$ and $f(\bar{x})$ is
finite. Given $\epsilon > 0$, there exists a rational $r$ such that $|f(\bar{x}) - r| < \epsilon$.
Since

\[ \frac{1}{m(B)} \int_B |f(y) - f(\bar{x})|\,dy \le \frac{1}{m(B)} \int_B |f(y) - r|\,dy + |f(\bar{x}) - r|, \]

we must have

\[ \limsup_{\substack{m(B) \to 0 \\ \bar{x} \in B}} \frac{1}{m(B)} \int_B |f(y) - f(\bar{x})|\,dy \le 2\epsilon, \]

and thus $\bar{x}$ is in the Lebesgue set of $f$. The corollary is therefore proved.

Remark. Recall from the definition in Section 2 of Chapter 2 that
elements of $L^1(\mathbb{R}^d)$ are actually equivalence classes, with two functions
being equivalent if they differ on a set of measure zero. It is interesting
to observe that the set of points where the averages (4) converge to a
limit is independent of the representation of $f$ chosen, because

\[ \int_B f(y)\,dy = \int_B g(y)\,dy \]

whenever $f$ and $g$ are equivalent. Nevertheless, the Lebesgue set of $f$
depends on the particular representative of $f$ that we consider.

We shall see that the Lebesgue set of a function enjoys a universal 
property in that at its points the function can be recovered by a wide 
variety of averages. We will prove this both for averages over sets that 
generalize balls, and in the setting of approximations to the identity. 
Note that the theory of differentiation developed so far uses averages 
over balls, but as we mentioned earlier, one could ask whether similar 
conclusions hold for other families of sets, such as cubes or rectangles. 
The answer depends in a fundamental way on the geometric properties 
of the family in question. For example, we now show that in the case of
cubes (and more generally families of sets with bounded "eccentricity")
the above results carry over. However, in the case of the family of all
rectangles the existence of the limit almost everywhere and the weak
type inequality fail (see Problem 8).

A collection of sets $\{U_\alpha\}$ is said to shrink regularly to $\bar{x}$ (or has
bounded eccentricity at $\bar{x}$) if there is a constant $c > 0$ such that for
each $U_\alpha$ there is a ball $B$ with

\[ \bar{x} \in B, \quad U_\alpha \subset B, \quad \text{and} \quad m(U_\alpha) \ge c\, m(B). \]

Thus $U_\alpha$ is contained in $B$, but its measure is comparable to the measure
of $B$. For example, the set of all open cubes containing $\bar{x}$ shrinks regularly
to $\bar{x}$. However, in $\mathbb{R}^d$ with $d \ge 2$ the collection of all open rectangles
containing $\bar{x}$ does not shrink regularly to $\bar{x}$. This can be seen if we
consider very thin rectangles.

Corollary 1.7 Suppose $f$ is locally integrable on $\mathbb{R}^d$. If $\{U_\alpha\}$ shrinks
regularly to $\bar{x}$, then

\[ \lim_{\substack{m(U_\alpha) \to 0 \\ \bar{x} \in U_\alpha}} \frac{1}{m(U_\alpha)} \int_{U_\alpha} f(y)\,dy = f(\bar{x}) \]

for every point $\bar{x}$ in the Lebesgue set of $f$.

The proof is immediate once we observe that if $\bar{x} \in B$ with $U_\alpha \subset B$
and $m(U_\alpha) \ge c\, m(B)$, then

\[ \frac{1}{m(U_\alpha)} \int_{U_\alpha} |f(y) - f(\bar{x})|\,dy \le \frac{1}{c\, m(B)} \int_B |f(y) - f(\bar{x})|\,dy. \]

2 Good kernels and approximations to the identity 

We shall now turn to averages of functions given as convolutions,^2 which
can be written as

\[ (f * K_\delta)(x) = \int_{\mathbb{R}^d} f(x - y) K_\delta(y)\,dy. \]

Here $f$ is a general integrable function, which we keep fixed, while the $K_\delta$
vary over a specific family of functions, referred to as kernels. Expressions
of this kind arise in many questions (for instance, in the Fourier inversion
theorem of the previous chapter), and were already discussed in Book I.

In our initial consideration we called these functions "good kernels" if
they are integrable and satisfy the following conditions for $\delta > 0$:

(i) $\int_{\mathbb{R}^d} K_\delta(x)\,dx = 1$.

(ii) $\int_{\mathbb{R}^d} |K_\delta(x)|\,dx \le A$.

(iii) For every $\eta > 0$,

\[ \int_{|x| \ge \eta} |K_\delta(x)|\,dx \to 0 \quad \text{as } \delta \to 0. \]

Here $A$ is a constant independent of $\delta$.

^2 Some basic properties of convolutions are described in Exercise 21 of the previous
chapter.

The main use of these kernels was that whenever $f$ is bounded, then
$(f * K_\delta)(x) \to f(x)$ as $\delta \to 0$, at every point of continuity of $f$. To obtain
a similar conclusion, one also valid at all points of the Lebesgue set
of $f$, we need to strengthen somewhat our assumptions on the kernels
$K_\delta$. To reflect this situation we adopt a different terminology and refer
to the resulting narrower class of kernels as approximations to the
identity. The assumptions are again that the $K_\delta$ are integrable and
satisfy condition (i) but, instead of (ii) and (iii), we assume:

(ii') $|K_\delta(x)| \le A\delta^{-d}$ for all $\delta > 0$.

(iii') $|K_\delta(x)| \le A\delta / |x|^{d+1}$ for all $\delta > 0$ and $x \in \mathbb{R}^d$.^3

We observe that these requirements are stronger and imply the conditions
in the definition of good kernels. Indeed, we first prove (ii). For that, we
use the second illustration of Corollary 1.10 in Chapter 2, which gives

\[ \int_{|x| \ge \epsilon} \frac{dx}{|x|^{d+1}} \le \frac{C}{\epsilon} \quad \text{for some } C > 0 \text{ and all } \epsilon > 0. \tag{5} \]

Then, using the estimates (ii') and (iii') when $|x| < \delta$ and $|x| \ge \delta$,
respectively, yields

\[ \int_{\mathbb{R}^d} |K_\delta(x)|\,dx = \int_{|x| < \delta} |K_\delta(x)|\,dx + \int_{|x| \ge \delta} |K_\delta(x)|\,dx
 \le A \int_{|x| < \delta} \frac{dx}{\delta^d} + A\delta \int_{|x| \ge \delta} \frac{dx}{|x|^{d+1}}
 \le A' + A'' < \infty. \]

Finally, the last condition of a good kernel is also verified, since another
application of (5) gives

\[ \int_{|x| \ge \eta} |K_\delta(x)|\,dx \le A\delta \int_{|x| \ge \eta} \frac{dx}{|x|^{d+1}} \le \frac{A'\delta}{\eta}, \]

and this last expression tends to 0 as $\delta \to 0$.

^3 Sometimes the condition (iii') is replaced by the requirement $|K_\delta(x)| \le A\delta^{\epsilon} / |x|^{d+\epsilon}$
for some fixed $\epsilon > 0$. However, the special case $\epsilon = 1$ suffices in most circumstances.


The term "approximation to the identity" originates in the fact that
the mapping $f \mapsto f * K_\delta$ converges to the identity mapping $f \mapsto f$, as
$\delta \to 0$, in various senses, as we shall see below. It is also connected with
the following heuristics. Figure 2 pictures a typical approximation to the
identity: for each $\delta > 0$, the kernel is supported on the set $|x| < \delta$ and
has height $1/2\delta$. As $\delta$ tends to 0, this family of kernels converges to the
so-called unit mass at the origin or Dirac delta "function." The latter
is heuristically defined by

\[ \mathcal{D}(x) = \begin{cases} \infty & \text{if } x = 0, \\ 0 & \text{if } x \ne 0, \end{cases}
 \quad \text{and} \quad \int \mathcal{D}(x)\,dx = 1. \]

Figure 2. An approximation to the identity

Since each $K_\delta$ integrates to 1, we may say loosely that

\[ K_\delta \to \mathcal{D} \quad \text{as } \delta \to 0. \]

If we think of the convolution $f * \mathcal{D}$ as $\int f(x - y)\mathcal{D}(y)\,dy$, the product
$f(x - y)\mathcal{D}(y)$ is 0 except when $y = 0$, and the mass of $\mathcal{D}$ is concentrated
at $y = 0$, so we may intuitively expect that

\[ (f * \mathcal{D})(x) = f(x). \]

Thus $f * \mathcal{D} = f$, and $\mathcal{D}$ plays the role of the identity for convolutions.
We should mention that this discussion can be formalized and $\mathcal{D}$ given
a precise definition either in terms of Lebesgue-Stieltjes measures, which
we take up in Chapter 6, or in terms of "generalized functions" (that is,
distributions), which we defer to Book IV.

We now turn to a series of examples of approximations to the identity. 

Example 1. Suppose $\varphi$ is a non-negative bounded function in $\mathbb{R}^d$ that
is supported on the unit ball $|x| \le 1$, and such that

\[ \int_{\mathbb{R}^d} \varphi = 1. \]

Then, if we set $K_\delta(x) = \delta^{-d} \varphi(\delta^{-1} x)$, the family $\{K_\delta\}_{\delta > 0}$ is an
approximation to the identity. The simple verification is left to the reader.
Important special cases are in the next two examples.

Example 2. The Poisson kernel for the upper half-plane is given by

\[ P_y(x) = \frac{1}{\pi} \frac{y}{x^2 + y^2}, \quad x \in \mathbb{R}, \]

where the parameter is now $\delta = y > 0$.

Example 3. The heat kernel in $\mathbb{R}^d$ is defined by

\[ \mathcal{H}_t(x) = \frac{1}{(4\pi t)^{d/2}}\, e^{-|x|^2 / 4t}. \]

Here $t > 0$ and we have $\delta = t^{1/2}$. Alternatively, we could set $\delta = 4\pi t$ to
make the notation consistent with the specific usage in Chapter 2.

Example 4. The Poisson kernel for the disc is

\[ \frac{1}{2\pi} P_r(x) = \begin{cases} \dfrac{1}{2\pi} \dfrac{1 - r^2}{1 - 2r \cos x + r^2} & \text{if } |x| \le \pi, \\ 0 & \text{if } |x| > \pi. \end{cases} \]

Here we have $0 < r < 1$ and $\delta = 1 - r$.

Example 5. The Fejér kernel is defined by

\[ \frac{1}{2\pi} F_N(x) = \begin{cases} \dfrac{1}{2\pi N} \dfrac{\sin^2(Nx/2)}{\sin^2(x/2)} & \text{if } |x| \le \pi, \\ 0 & \text{if } |x| > \pi, \end{cases} \]

where $\delta = 1/N$.

We note that Examples 2 through 5 have already appeared in Book I.
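For the Poisson kernel of Example 2 the defining conditions can be checked by hand,
and a short numerical sanity check is sketched below. It assumes $d = 1$, $\delta = y$, and
the constant $A = 1/\pi$, which follows from $P_y(x) \le y / (\pi y^2)$ and $P_y(x) \le y / (\pi x^2)$;
the helper `midpoint_integral`, the truncation of the integral to $[-1000, 1000]$, and
the sample points are illustrative choices, not part of the text.

```python
import math


def poisson(y, x):
    """Poisson kernel P_y(x) for the upper half-plane (d = 1, delta = y)."""
    return y / (math.pi * (x * x + y * y))


def midpoint_integral(g, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (k + 0.5) * h) for k in range(n))


A = 1 / math.pi                      # constant that works in (ii') and (iii') for this kernel
for y in (1.0, 0.1):
    # Condition (i), up to the small mass outside the truncated range.
    mass = midpoint_integral(lambda x: poisson(y, x), -1000.0, 1000.0)
    xs = [k / 10 for k in range(-50, 51) if k != 0]
    ok_ii = all(poisson(y, x) <= A / y * (1 + 1e-12) for x in xs)           # (ii') with d = 1
    ok_iii = all(poisson(y, x) <= A * y / x**2 * (1 + 1e-12) for x in xs)   # (iii') with d = 1
    print(y, round(mass, 4), ok_ii, ok_iii)
```

The truncated integral falls slightly short of 1 because the tails of $P_y$ decay only
like $1/x^2$; the exact value over $[-R, R]$ is $(2/\pi)\arctan(R/y)$.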

We now turn to a general result about approximations to the identity 
that highlights the role of the Lebesgue set. 

Theorem 2.1 If $\{K_\delta\}_{\delta > 0}$ is an approximation to the identity and $f$ is
integrable on $\mathbb{R}^d$, then

\[ (f * K_\delta)(x) \to f(x) \quad \text{as } \delta \to 0 \]

for every $x$ in the Lebesgue set of $f$. In particular, the limit holds for
a.e. $x$.

Since the integral of each kernel $K_\delta$ is equal to 1, we may write

\[ (f * K_\delta)(x) - f(x) = \int [f(x - y) - f(x)]\, K_\delta(y)\,dy. \]

Consequently,

\[ |(f * K_\delta)(x) - f(x)| \le \int |f(x - y) - f(x)|\, |K_\delta(y)|\,dy, \]

and it now suffices to prove that the right-hand side tends to 0 as $\delta$ goes
to 0. The argument we give depends on a simple result that we isolate
in the next lemma.

Lemma 2.2 Suppose that $f$ is integrable on $\mathbb{R}^d$, and that $x$ is a point of
the Lebesgue set of $f$. Let

\[ A(r) = \frac{1}{r^d} \int_{|y| \le r} |f(x - y) - f(x)|\,dy, \quad \text{whenever } r > 0. \]

Then $A(r)$ is a continuous function of $r > 0$, and

\[ A(r) \to 0 \quad \text{as } r \to 0. \]

Moreover, $A(r)$ is bounded, that is, $A(r) \le M$ for some $M > 0$ and all
$r > 0$.

Proof. The continuity of $A(r)$ follows by invoking the absolute
continuity in Proposition 1.12 of Chapter 2.

The fact that $A(r)$ tends to 0 as $r$ tends to 0 follows since $x$ belongs
to the Lebesgue set of $f$, and the measure of a ball of radius $r$ is $v_d r^d$.
This and the continuity of $A(r)$ for $0 < r \le 1$ show that $A(r)$ is bounded
when $0 < r \le 1$. To prove that $A(r)$ is bounded for $r > 1$, note that

\[ A(r) \le \frac{1}{r^d} \int_{|y| \le r} |f(x - y)|\,dy + \frac{1}{r^d} \int_{|y| \le r} |f(x)|\,dy
 \le r^{-d} \|f\|_{L^1(\mathbb{R}^d)} + v_d |f(x)|, \]

and this concludes the proof of the lemma.

We now return to the proof of the theorem. The key consists in writing
the integral over $\mathbb{R}^d$ as a sum of integrals over annuli as follows:

\[ \int |f(x - y) - f(x)|\, |K_\delta(y)|\,dy = \int_{|y| \le \delta} + \sum_{k=0}^\infty \int_{2^k \delta < |y| \le 2^{k+1} \delta}. \]

By using the property (ii') of the approximation to the identity, the first
term is estimated by

\[ \int_{|y| \le \delta} |f(x - y) - f(x)|\, |K_\delta(y)|\,dy \le \frac{c}{\delta^d} \int_{|y| \le \delta} |f(x - y) - f(x)|\,dy \le c A(\delta). \]

Each term in the sum is estimated similarly, but this time by using
property (iii') of approximations to the identity:

\[ \int_{2^k \delta < |y| \le 2^{k+1} \delta} |f(x - y) - f(x)|\, |K_\delta(y)|\,dy
 \le \frac{c\delta}{(2^k \delta)^{d+1}} \int_{|y| \le 2^{k+1} \delta} |f(x - y) - f(x)|\,dy
 \le \frac{c'}{2^k (2^{k+1} \delta)^d} \int_{|y| \le 2^{k+1} \delta} |f(x - y) - f(x)|\,dy
 \le c' 2^{-k} A(2^{k+1} \delta). \]

Putting these estimates together, we find that

\[ |(f * K_\delta)(x) - f(x)| \le c A(\delta) + c' \sum_{k=0}^\infty 2^{-k} A(2^{k+1} \delta). \]

Given $\epsilon > 0$, we first choose $N$ so large that $\sum_{k \ge N} 2^{-k} < \epsilon$. Then, by
making $\delta$ sufficiently small, we have by the lemma

\[ A(2^k \delta) < \epsilon / N, \quad \text{whenever } k = 0, 1, \ldots, N. \]

Hence, recalling that $A(r)$ is bounded, we find

\[ |(f * K_\delta)(x) - f(x)| \le C\epsilon \]

for all sufficiently small $\delta$, and the theorem is proved.
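The role of the Lebesgue set in Theorem 2.1 can be seen in a closed-form example.
Take $f = \chi_{[0,1]}$ on $\mathbb{R}$ and the Poisson kernel $P_y$ of Example 2; integrating $P_y$ over
$(x - 1, x)$ gives $(f * P_y)(x) = \frac{1}{\pi}\left[\arctan(x/y) - \arctan((x - 1)/y)\right]$. The sketch
below simply evaluates this formula for decreasing $y$; the function name
`poisson_average` is an illustrative choice.

```python
import math


def poisson_average(x, y):
    """(f * P_y)(x) for f the indicator function of [0, 1], in closed form:
    the integral of P_y over (x - 1, x) equals
    (arctan(x / y) - arctan((x - 1) / y)) / pi."""
    return (math.atan(x / y) - math.atan((x - 1) / y)) / math.pi


# Two points of continuity of f, then the jump point x = 0.
for x in (0.5, 2.0, 0.0):
    print(x, [round(poisson_average(x, 10.0 ** -k), 4) for k in (1, 3, 5)])
```

At $x = 0.5$ and $x = 2$, which are points of continuity and hence in the Lebesgue set,
the values tend to $f(x)$. At the jump $x = 0$ they tend to $1/2$ rather than $f(0) = 1$;
this point is not in the Lebesgue set of this representative, so the theorem makes
no claim there.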

In addition to this pointwise result, convolutions with approximations
to the identity also provide convergence in the $L^1$-norm.

Theorem 2.3 Suppose that $f$ is integrable on $\mathbb{R}^d$ and that $\{K_\delta\}_{\delta > 0}$ is
an approximation to the identity. Then, for each $\delta > 0$, the convolution

\[ (f * K_\delta)(x) = \int_{\mathbb{R}^d} f(x - y) K_\delta(y)\,dy \]

is integrable, and

\[ \|(f * K_\delta) - f\|_{L^1(\mathbb{R}^d)} \to 0, \quad \text{as } \delta \to 0. \]

The proof is merely a repetition in a more general context of the argument
in the special case where $K_\delta(x) = \delta^{-d/2} e^{-\pi |x|^2 / \delta}$ given in Section 4*,
Chapter 2, and so will not be repeated.


3 Differentiability of functions 

We now take up the second question raised at the beginning of this
chapter, that of finding a broad condition on functions $F$ that guarantees
the identity

\[ F(b) - F(a) = \int_a^b F'(x)\,dx. \tag{6} \]

There are two phenomena that make a general formulation of this identity
problematic. First, because of the existence of non-differentiable
functions,^4 the right-hand side of (6) might not be meaningful if we merely
assumed $F$ was continuous. Second, even if $F'(x)$ existed for every $x$,
the function $F'$ would not necessarily be (Lebesgue) integrable. (See
Exercise 12.)

How do we deal with these difficulties? One way is by limiting ourselves
to those functions $F$ that arise as indefinite integrals (of integrable
functions). This raises the issue of how to characterize such functions, and
we approach that problem via the study of a wider class, the functions
of bounded variation. These functions are closely related to the question
of rectifiability of curves, and we start by considering this connection.

^4 In particular, there are continuous nowhere differentiable functions. See Chapter 4 in
Book I, or also Chapter 7 below.

3.1 Functions of bounded variation 

Let $\gamma$ be a parametrized curve in the plane given by $z(t) = (x(t), y(t))$,
where $a \le t \le b$. Here $x(t)$ and $y(t)$ are continuous real-valued functions
on $[a, b]$. The curve $\gamma$ is rectifiable if there exists $M < \infty$ such that, for
any partition $a = t_0 < t_1 < \cdots < t_N = b$ of $[a, b]$,

\[ \sum_{j=1}^N |z(t_j) - z(t_{j-1})| \le M. \tag{7} \]

By definition, the length $L(\gamma)$ of the curve is the supremum over all
partitions of the sum on the left-hand side, that is,

\[ L(\gamma) = \sup_{a = t_0 < t_1 < \cdots < t_N = b} \sum_{j=1}^N |z(t_j) - z(t_{j-1})|. \]

Alternatively, $L(\gamma)$ is the infimum of all $M$ that satisfy (7).
Geometrically, the quantity $L(\gamma)$ is obtained by approximating the curve by
polygonal lines and taking the limit of the length of these polygonal
lines as the interval $[a, b]$ is partitioned more finely (see the illustration
in Figure 3).
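The polygonal approximation just described is easy to try numerically. The sketch
below uses the unit circle $z(t) = (\cos t, \sin t)$, $0 \le t \le 2\pi$, as an assumed example
(it does not appear in the text), identifies the plane with the complex numbers, and
uses uniform partitions that are refined by doubling the number of points.

```python
import math


def polygonal_length(z, partition):
    """Sum of |z(t_j) - z(t_{j-1})| over consecutive points of the partition."""
    return sum(abs(z(t1) - z(t0)) for t0, t1 in zip(partition, partition[1:]))


def uniform_partition(a, b, n):
    """The partition a = t_0 < t_1 < ... < t_n = b with equally spaced points."""
    return [a + (b - a) * j / n for j in range(n + 1)]


def circle(t):
    return complex(math.cos(t), math.sin(t))      # z(t) = x(t) + i y(t)


# Each partition refines the previous one, so the lengths can only increase.
lengths = [polygonal_length(circle, uniform_partition(0.0, 2 * math.pi, 2 ** k))
           for k in range(2, 11)]
print([round(v, 5) for v in lengths], round(2 * math.pi, 5))
```

The polygonal lengths increase under refinement and stay below $2\pi$, which is their
supremum, so any $M \ge 2\pi$ works in (7) for this curve.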

Naturally, we may now ask the following questions: What analytic
condition on $x(t)$ and $y(t)$ guarantees rectifiability of the curve $\gamma$? In
particular, must the derivatives of $x(t)$ and $y(t)$ exist? If so, does one
have the desired formula

\[ L(\gamma) = \int_a^b \left( x'(t)^2 + y'(t)^2 \right)^{1/2} dt\,? \]

The answer to the first question leads directly to the class of functions
of bounded variation, a class that plays a key role in the theory of
differentiation.

Figure 3. Approximation of a rectifiable curve by polygonal lines

Suppose $F(t)$ is a complex-valued function defined on $[a, b]$, and $a =
t_0 < t_1 < \cdots < t_N = b$ is a partition of this interval. The variation of $F$
on this partition is defined by

\[ \sum_{j=1}^N |F(t_j) - F(t_{j-1})|. \]

The function $F$ is said to be of bounded variation if the variations of
$F$ over all partitions are bounded, that is, there exists $M < \infty$ so that

\[ \sum_{j=1}^N |F(t_j) - F(t_{j-1})| \le M \]

for all partitions $a = t_0 < t_1 < \cdots < t_N = b$. In this definition we do not
assume that $F$ is continuous; however, when applying it to the case of
curves, we will suppose that $F(t) = z(t) = x(t) + iy(t)$ is continuous.

We observe that if a partition $\tilde{\mathcal{P}}$ given by $a = \tilde{t}_0 < \tilde{t}_1 < \cdots < \tilde{t}_M = b$ is
a refinement^5 of a partition $\mathcal{P}$ given by $a = t_0 < t_1 < \cdots < t_N = b$, then
the variation of $F$ on $\tilde{\mathcal{P}}$ is greater than or equal to the variation of $F$ on
$\mathcal{P}$.

Theorem 3.1 A curve parametrized by $(x(t), y(t))$, $a \le t \le b$, is
rectifiable if and only if both $x(t)$ and $y(t)$ are of bounded variation.

The proof is immediate once we observe that if $F(t) = x(t) + iy(t)$, then

\[ F(t_j) - F(t_{j-1}) = (x(t_j) - x(t_{j-1})) + i\,(y(t_j) - y(t_{j-1})), \]

and if $a$ and $b$ are real, then $|a + ib| \le |a| + |b| \le 2|a + ib|$.

Intuitively, a function of bounded variation cannot oscillate too often
with amplitudes that are too large. Some examples should help clarify
this assertion.

^5 We say that a partition $\tilde{\mathcal{P}}$ of $[a, b]$ is a refinement of a partition $\mathcal{P}$ of $[a, b]$ if every
point in $\mathcal{P}$ also belongs to $\tilde{\mathcal{P}}$.

We first fix some terminology. A real-valued function $F$ defined on
$[a, b]$ is increasing if $F(t_1) \le F(t_2)$ whenever $a \le t_1 \le t_2 \le b$. If the
inequality is strict, we say that $F$ is strictly increasing.

Example 1. If $F$ is real-valued, monotonic, and bounded, then $F$ is of
bounded variation. Indeed, if for example $F$ is increasing and bounded
by $M$, we see that

\[ \sum_{j=1}^N |F(t_j) - F(t_{j-1})| = \sum_{j=1}^N \left( F(t_j) - F(t_{j-1}) \right) = F(b) - F(a) \le 2M. \]

Example 2. If $F$ is differentiable at every point, and $F'$ is bounded,
then $F$ is of bounded variation. Indeed, if $|F'| \le M$, the mean value
theorem implies

\[ |F(x) - F(y)| \le M|x - y|, \quad \text{for all } x, y \in [a, b], \]

hence $\sum_{j=1}^N |F(t_j) - F(t_{j-1})| \le M(b - a)$. (See also Exercise 23.)

Example 3. Let

\[ F(x) = \begin{cases} x^a \sin(x^{-b}) & \text{for } 0 < x \le 1, \\ 0 & \text{if } x = 0. \end{cases} \]

Then $F$ is of bounded variation on $[0, 1]$ if and only if $a > b$ (Exercise 11).
Figure 4 illustrates the three cases $a > b$, $a = b$, and $a < b$.
The next result shows that in some sense the first example above
exhausts all functions of bounded variation. For its proof, we need the
following definitions. The total variation of $F$ on $[a, x]$ (where $a \le x \le b$)
is defined by

\[ T_F(a, x) = \sup \sum_{j=1}^N |F(t_j) - F(t_{j-1})|, \]

where the sup is over all partitions of $[a, x]$. The preceding definition
makes sense if $F$ is complex-valued. The succeeding ones require that
$F$ is real-valued. In the spirit of the first definition, we say that the
positive variation of $F$ on $[a, x]$ is

\[ P_F(a, x) = \sup \sum_{(+)} \left( F(t_j) - F(t_{j-1}) \right), \]

where the sum is over all $j$ such that $F(t_j) \ge F(t_{j-1})$, and the supremum
is over all partitions of $[a, x]$. Finally, the negative variation of $F$ on
$[a, x]$ is defined by

\[ N_F(a, x) = \sup \sum_{(-)} -\left[ F(t_j) - F(t_{j-1}) \right], \]

where the sum is over all $j$ such that $F(t_j) \le F(t_{j-1})$, and the supremum
is over all partitions of $[a, x]$.

Lemma 3.2 Suppose F is real-valued and of bounded variation on [a, b ]. 
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Then for all a < x < b one has 

F(x) — F(a) = Pp(a, x) — Nf(cl, x), 

and 

Tp(a,x) = x) + Njr(a^ x). 

Proof. Given $\epsilon > 0$ there exists a partition
$a = t_0 < \cdots < t_N = x$ of $[a,x]$, such that

\[ \Bigl| P_F - \sum_{(+)} F(t_j) - F(t_{j-1}) \Bigr| < \epsilon \quad \text{and} \quad \Bigl| N_F - \sum_{(-)} -[F(t_j) - F(t_{j-1})] \Bigr| < \epsilon. \]

(To see this, it suffices to use the definition to obtain similar estimates
for $P_F$ and $N_F$ with possibly different partitions, and then to consider a
common refinement of these two partitions.) Since we also note that

\[ F(x) - F(a) = \sum_{(+)} F(t_j) - F(t_{j-1}) - \sum_{(-)} -[F(t_j) - F(t_{j-1})], \]

we find that $|F(x) - F(a) - [P_F - N_F]| < 2\epsilon$, which proves the first
identity.

For the second identity, we also note that for any partition
$a = t_0 < \cdots < t_N = x$ of $[a,x]$ we have

\[ \sum_{j=1}^{N} |F(t_j) - F(t_{j-1})| = \sum_{(+)} F(t_j) - F(t_{j-1}) + \sum_{(-)} -[F(t_j) - F(t_{j-1})], \]

hence $T_F \le P_F + N_F$. Also, the above implies

\[ \sum_{(+)} F(t_j) - F(t_{j-1}) + \sum_{(-)} -[F(t_j) - F(t_{j-1})] \le T_F. \]

Once again, one can argue using common refinements of partitions in the
definitions of $P_F$ and $N_F$ to deduce the inequality $P_F + N_F \le T_F$, and
the lemma is proved.
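The partition-level identities behind Lemma 3.2 can be checked numerically. The following is a minimal sketch, assuming $F = \sin$ on $[0, 2\pi]$ and a uniform partition; the names `variation_sums` and `uniform_partition` are illustrative and do not come from the text. For one fixed partition the sums satisfy the same two relations that the lemma asserts for the suprema.

```python
import math

def variation_sums(F, partition):
    """Sums of |increment|, of the positive increments, and of minus the negative ones."""
    total = positive = negative = 0.0
    for t_prev, t_next in zip(partition, partition[1:]):
        increment = F(t_next) - F(t_prev)
        total += abs(increment)
        if increment >= 0:
            positive += increment      # term of the (+) sum
        else:
            negative += -increment     # term of the (-) sum
    return total, positive, negative

def uniform_partition(a, x, n):
    """Hypothetical helper: endpoints of n equal subintervals of [a, x]."""
    return [a + (x - a) * k / n for k in range(n + 1)]

a, x = 0.0, 2.0 * math.pi
T, P, N = variation_sums(math.sin, uniform_partition(a, x, 4000))
print(T, P, N)
```

Because the turning points of the sine lie on this partition, the three sums are close to the total, positive, and negative variations themselves; for a general $F$ a single partition only gives lower bounds.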


Theorem 3.3 A real-valued function $F$ on $[a,b]$ is of bounded variation
if and only if $F$ is the difference of two increasing bounded functions.

Proof. Clearly, if $F = F_1 - F_2$, where each $F_j$ is bounded and
increasing, then $F$ is of bounded variation.

Conversely, suppose $F$ is of bounded variation. Then, we let
$F_1(x) = P_F(a,x) + F(a)$ and $F_2(x) = N_F(a,x)$. Clearly, both $F_1$ and
$F_2$ are increasing, of bounded variation, and by the lemma
$F(x) = F_1(x) - F_2(x)$.


Observe that as a consequence, a complex-valued function of bounded 
variation is a (complex) linear combination of four increasing functions. 

Returning to the curve $\gamma$ parametrized by a continuous function
$z(t) = x(t) + iy(t)$, we want to make some comment about its associated
length function. Assuming that the curve is rectifiable, we define $L(A,B)$
as the length of the segment of $\gamma$ that arises as the image of those $t$
for which $A \le t \le B$, with $a \le A \le B \le b$. Note that
$L(A,B) = T_F(A,B)$, where $F(t) = z(t)$. We see that

\[ (8) \qquad L(A,C) + L(C,B) = L(A,B) \quad \text{if } A \le C \le B. \]

We also observe that $L(A,B)$ is a continuous function of $B$ (and of
$A$). Since it is an increasing function, to prove its continuity in $B$ from
the left, it suffices to see that for each $B$ and $\epsilon > 0$, we can find
$B_1 < B$ such that $L(A,B_1) \ge L(A,B) - \epsilon$. We do this by first
finding a partition $A = t_0 < t_1 < \cdots < t_N = B$ such that the length of
the corresponding polygonal line is $\ge L(A,B) - \epsilon/2$. By continuity
of the function $z(t)$, we can find a $B_1$, with $t_{N-1} < B_1 < B$, such that
$|z(B) - z(B_1)| < \epsilon/2$. Now for the refined partition
$t_0 < t_1 < \cdots < t_{N-1} < B_1 < B$, the length of the polygonal line is
still $\ge L(A,B) - \epsilon/2$. Therefore, the length for the partition
$t_0 < t_1 < \cdots < t_{N-1} < B_1$ is $\ge L(A,B) - \epsilon$, and thus
$L(A,B_1) \ge L(A,B) - \epsilon$.

To prove continuity from the right at $B$, let $\epsilon > 0$, pick any
$C > B$, and choose a partition $B = t_0 < t_1 < \cdots < t_N = C$ such that

\[ L(B,C) - \epsilon/2 < \sum_{j=0}^{N-1} |z(t_{j+1}) - z(t_j)|. \]

By considering a refinement of this partition if necessary, we may assume
since $z$ is continuous that $|z(t_1) - z(t_0)| < \epsilon/2$. If we denote
$B_1 = t_1$, then we get

\[ L(B,C) - \epsilon/2 < \epsilon/2 + L(B_1,C). \]

Since $L(B,B_1) + L(B_1,C) = L(B,C)$ we have $L(B,B_1) < \epsilon$, and
therefore $L(A,B_1) - L(A,B) < \epsilon$.

Note that what we have observed can be re-stated as follows: if a 
function of bounded variation is continuous, then so is its total variation. 

The next result lies at the heart of the theory of differentiation.

Theorem 3.4 If $F$ is of bounded variation on $[a,b]$, then $F$ is
differentiable almost everywhere.

In other words, the quotient

\[ \lim_{h \to 0} \frac{F(x+h) - F(x)}{h} \]

exists for almost every $x \in [a,b]$. By the previous result, it suffices to
consider the case when F is increasing. In fact, we shall first also assume 
that F is continuous. This makes the argument simpler. As for the 
general case, we leave that till later. (See Section 3.3.) It will then 
be instructive to examine the nature of the possible discontinuities of a 
function of bounded variation, and reduce matters to the case of “jump 
functions.” 

We begin with a nice technical lemma of F. Riesz, which has the effect 
of a covering argument. 

Lemma 3.5 Suppose $G$ is real-valued and continuous on $\mathbb{R}$. Let $E$
be the set of points $x$ such that

\[ G(x+h) > G(x) \quad \text{for some } h = h_x > 0. \]

If $E$ is non-empty, then it must be open, and hence can be written as a
countable disjoint union of open intervals $E = \bigcup (a_k, b_k)$. If
$(a_k, b_k)$ is a finite interval in this union, then

\[ G(b_k) - G(a_k) = 0. \]


Proof. Since $G$ is continuous, it is clear that $E$ is open whenever it is
non-empty and can therefore be written as a disjoint union of countably
many open intervals (Theorem 1.3 in Chapter 1). If $(a_k, b_k)$ denotes a
finite interval in this decomposition, then $a_k \notin E$; therefore we
cannot have $G(b_k) > G(a_k)$. We now suppose that $G(b_k) < G(a_k)$. By
continuity, there exists $a_k < c < b_k$ so that

\[ G(c) = \frac{G(a_k) + G(b_k)}{2}, \]

and in fact we may choose $c$ farthest to the right in the interval
$(a_k, b_k)$. Since $c \in E$, there exists $d > c$ such that $G(d) > G(c)$.
Since $b_k \notin E$, we must have $G(x) \le G(b_k)$ for all $x \ge b_k$;
therefore $d < b_k$. Since $G(d) > G(c)$, there exists (by continuity)
$c' > d$ with $c' < b_k$ and $G(c') = G(c)$, which contradicts the fact that
$c$ was chosen farthest to the right in $(a_k, b_k)$. This shows that we must
have $G(a_k) = G(b_k)$, and the lemma is proved.

Note. This result sometimes carries the name “rising sun lemma” for 
the following reason. If one thinks of the sun rising from the east (at 
the right) with the rays of light parallel to the x-axis, then the points 
$(x, G(x))$ on the graph of $G$, with $x \in E$, are precisely the points which
are in the shade; these points appear in bold in Figure 5.



Figure 5. Rising sun lemma 
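A discrete analogue of the rising sun lemma can be sketched on sampled values. This is only an illustration under assumptions: the function $G(x) = \sin 3x - x/2$, the grid on $[0,6]$, and the name `shaded_runs` are all chosen here and do not appear in the text. An index is "in the shade" when some later sample is strictly larger; for each maximal run of such indices that does not start at the left end, the sample just before the run and the sample just after it play the roles of $G(a_k)$ and $G(b_k)$.

```python
import math

def shaded_runs(values):
    """Maximal runs (i, j) of indices having some strictly larger later sample."""
    n = len(values)
    # suffix_max[i] = max(values[i:]); the sentinel at index n is -infinity.
    suffix_max = [-math.inf] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix_max[i] = max(values[i], suffix_max[i + 1])
    in_shade = [suffix_max[i + 1] > values[i] for i in range(n)]
    runs, start = [], None
    for i, flag in enumerate(in_shade):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    return runs  # the last index is never shaded, so every run is closed

xs = [6.0 * k / 6000 for k in range(6001)]
samples = [math.sin(3.0 * x) - 0.5 * x for x in xs]
runs = shaded_runs(samples)
# Index pairs (a, b) bracketing each run that does not touch the left endpoint.
interior = [(i - 1, j + 1) for i, j in runs if i > 0]
for a_idx, b_idx in interior:
    print(xs[a_idx], xs[b_idx], samples[a_idx] - samples[b_idx])
```

On a grid one only gets $G(a_k) \ge G(b_k)$ with a gap no larger than one sampling step of $G$; equality is recovered in the continuous limit. Runs that start at the left endpoint correspond to the exceptional case described in Corollary 3.6.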


A slight modification of the proof of Lemma 3.5 gives: 


Corollary 3.6 Suppose $G$ is real-valued and continuous on a closed
interval $[a,b]$. If $E$ denotes the set of points $x$ in $(a,b)$ so that
$G(x+h) > G(x)$ for some $h > 0$, then $E$ is either empty or open. In the
latter case, it is a disjoint union of countably many intervals $(a_k, b_k)$,
and $G(a_k) = G(b_k)$, except possibly when $a = a_k$, in which case we only have

\[ G(a_k) \le G(b_k). \]


For the proof of the theorem, we define the quantity

\[ \Delta_h(F)(x) = \frac{F(x+h) - F(x)}{h}. \]

We also consider the four Dini numbers at $x$ defined by

\[ D^+(F)(x) = \limsup_{h \to 0,\ h > 0} \Delta_h(F)(x) \]

\[ D_+(F)(x) = \liminf_{h \to 0,\ h > 0} \Delta_h(F)(x) \]

\[ D^-(F)(x) = \limsup_{h \to 0,\ h < 0} \Delta_h(F)(x) \]

\[ D_-(F)(x) = \liminf_{h \to 0,\ h < 0} \Delta_h(F)(x). \]
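To see how the Dini numbers can differ, here is a small numerical sketch. It assumes the example $F(x) = x \sin(1/x)$ with $F(0) = 0$ (an example chosen here, not taken from the text) and estimates the one-sided upper and lower limits at $x = 0$ by sampling difference quotients over a window of small $h$. A finite sample can only suggest the values of a limsup or liminf; it does not prove them.

```python
import math

def difference_quotient(F, x, h):
    """The quantity Delta_h(F)(x) = (F(x + h) - F(x)) / h."""
    return (F(x + h) - F(x)) / h

def F(x):
    # Illustrative function: continuous at 0 but not differentiable there.
    return x * math.sin(1.0 / x) if x != 0 else 0.0

# Small positive steps h = 1/u with u running over [1000, 1100] in steps of 0.01.
steps = [1.0 / (1000.0 + 0.01 * k) for k in range(10001)]
right = [difference_quotient(F, 0.0, h) for h in steps]
left = [difference_quotient(F, 0.0, -h) for h in steps]

upper_right, lower_right = max(right), min(right)   # estimates of D^+ and D_+
upper_left, lower_left = max(left), min(left)       # estimates of D^- and D_-
print(upper_right, lower_right, upper_left, lower_left)
```

Here the quotient at $0$ equals $\sin(1/h)$ for $h > 0$, so the sampled extremes approach $1$ and $-1$: the upper and lower Dini numbers disagree and $F'(0)$ does not exist.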

Clearly, one has $D_+ \le D^+$ and $D_- \le D^-$. To prove the theorem it
suffices to show that

(i) $D^+(F)(x) < \infty$ for a.e. $x$, and;

(ii) $D^+(F)(x) \le D_-(F)(x)$ for a.e. $x$.

Indeed, if these results hold, then by applying (ii) to $-F(-x)$ instead of
$F(x)$ we obtain $D^-(F)(x) \le D_+(F)(x)$ for a.e. $x$. Therefore

\[ D^+ \le D_- \le D^- \le D_+ \le D^+ < \infty \quad \text{for a.e. } x. \]

Thus all four Dini numbers are finite and equal almost everywhere, hence
$F'(x)$ exists for almost every point $x$.

We recall that we assume that $F$ is increasing, bounded, and continuous
on $[a,b]$. For a fixed $\gamma > 0$, let

\[ E_\gamma = \{ x : D^+(F)(x) > \gamma \}. \]

First, we assert that $E_\gamma$ is measurable. (The proof of this simple fact
is outlined in Exercise 14.) Next, we apply Corollary 3.6 to the function
$G(x) = F(x) - \gamma x$, and note that we then have
$E_\gamma \subset \bigcup_k (a_k, b_k)$, where
$F(b_k) - F(a_k) \ge \gamma (b_k - a_k)$. Consequently,

\[ m(E_\gamma) \le \sum_k m((a_k, b_k)) \le \frac{1}{\gamma} \sum_k F(b_k) - F(a_k) \le \frac{1}{\gamma} (F(b) - F(a)). \]

Therefore $m(E_\gamma) \to 0$ as $\gamma$ tends to infinity, and since
$\{ D^+F(x) = \infty \} \subset E_\gamma$ for all $\gamma$, this proves that
$D^+F(x) < \infty$ almost everywhere.

Having fixed real numbers $r$ and $R$ such that $R > r$, we let

\[ E = \{ x \in [a,b] : D^+(F)(x) > R \ \text{and} \ r > D_-(F)(x) \}. \]

We will have shown $D^+(F)(x) \le D_-(F)(x)$ almost everywhere once we
prove that $m(E) = 0$, since it then suffices to let $R$ and $r$ vary over the
rationals with $R > r$.

To prove that $m(E) = 0$ we may assume that $m(E) > 0$ and arrive at
a contradiction. Because $R/r > 1$ we can find an open set $\mathcal{O}$ such that
$E \subset \mathcal{O} \subset (a,b)$, yet $m(\mathcal{O}) < m(E) \cdot R/r$.

Now $\mathcal{O}$ can be written as $\bigcup I_n$, with $I_n$ disjoint open
intervals. Fix $n$ and apply Corollary 3.6 to the function
$G(x) = -F(-x) + rx$ on the interval $-I_n$. Reflecting through the origin
again yields an open set $\bigcup_k (a_k, b_k)$ contained in $I_n$, where the
intervals $(a_k, b_k)$ are disjoint, with

\[ F(b_k) - F(a_k) \le r (b_k - a_k). \]

However, on each interval $(a_k, b_k)$ we apply Corollary 3.6, this time to
$G(x) = F(x) - Rx$. We thus obtain an open set
$\mathcal{O}_n = \bigcup_{k,j} (a_{k,j}, b_{k,j})$ of disjoint open intervals
$(a_{k,j}, b_{k,j})$ with $(a_{k,j}, b_{k,j}) \subset (a_k, b_k)$ for every $j$,
and

\[ F(b_{k,j}) - F(a_{k,j}) \ge R (b_{k,j} - a_{k,j}). \]

Then using the fact that $F$ is increasing we find that

\[ m(\mathcal{O}_n) = \sum_{k,j} (b_{k,j} - a_{k,j}) \le \frac{1}{R} \sum_{k,j} F(b_{k,j}) - F(a_{k,j}) \le \frac{1}{R} \sum_k F(b_k) - F(a_k) \le \frac{r}{R} \sum_k (b_k - a_k) \le \frac{r}{R}\, m(I_n). \]

Note that $\mathcal{O}_n \supset E \cap I_n$, since $D^+F(x) > R$ and
$r > D_-F(x)$ for each $x \in E$; of course, $I_n \supset \mathcal{O}_n$. We now
sum in $n$. Therefore

\[ m(E) = \sum_n m(E \cap I_n) \le \sum_n m(\mathcal{O}_n) \le \frac{r}{R} \sum_n m(I_n) = \frac{r}{R}\, m(\mathcal{O}) < m(E). \]

The strict inequality gives a contradiction and Theorem 3.4 is proved, at
least when $F$ is continuous.

Let us see how far we have come regarding (6) if $F$ is a monotonic
function.

Corollary 3.7 If $F$ is increasing and continuous, then $F'$ exists almost
everywhere. Moreover $F'$ is measurable, non-negative, and

\[ \int_a^b F'(x)\,dx \le F(b) - F(a). \]

In particular, if $F$ is bounded on $\mathbb{R}$, then $F'$ is integrable on
$\mathbb{R}$.

Proof. For $n \ge 1$, we consider the quotient

\[ G_n(x) = \frac{F(x + 1/n) - F(x)}{1/n}. \]

By the previous theorem, we have that $G_n(x) \to F'(x)$ for a.e. $x$, which
shows in particular that $F'$ is measurable and non-negative.

We now extend $F$ as a continuous function on all of $\mathbb{R}$. By Fatou's
lemma (Lemma 1.7 in Chapter 2) we know that

\[ \int_a^b F'(x)\,dx \le \liminf_{n \to \infty} \int_a^b G_n(x)\,dx. \]

To complete the proof, it suffices to note that

\[ \int_a^b G_n(x)\,dx = n \int_a^b F(x + 1/n)\,dx - n \int_a^b F(x)\,dx = n \int_{a+1/n}^{b+1/n} F(y)\,dy - n \int_a^b F(x)\,dx = n \int_b^{b+1/n} F(x)\,dx - n \int_a^{a+1/n} F(x)\,dx. \]

Since $F$ is continuous, the first and second terms converge to $F(b)$ and
$F(a)$, respectively, as $n$ goes to infinity, so the proof of the corollary is
complete.


We cannot go any farther than the inequality in the corollary if we 
allow all continuous increasing functions, as is shown by the following 
important example. 


The Cantor-Lebesgue function 

The following simple construction yields a continuous function
$F : [0,1] \to [0,1]$ that is increasing with $F(0) = 0$ and $F(1) = 1$, but
$F'(x) = 0$ almost everywhere! Hence $F$ is of bounded variation, but

\[ \int_a^b F'(x)\,dx \ne F(b) - F(a). \]

Consider the standard triadic Cantor set $C \subset [0,1]$ described at the
end of Section 1 in Chapter 1, and recall that

\[ C = \bigcap_{k=0}^{\infty} C_k, \]

where each $C_k$ is a disjoint union of $2^k$ closed intervals. For example,
$C_1 = [0,1/3] \cup [2/3,1]$. Let $F_1(x)$ be the continuous increasing function
on $[0,1]$ that satisfies $F_1(0) = 0$, $F_1(x) = 1/2$ if $1/3 \le x \le 2/3$,
$F_1(1) = 1$, and $F_1$ is linear on $C_1$. Similarly, let $F_2(x)$ be continuous
and increasing, and such that

\[ F_2(x) = \begin{cases} 0 & \text{if } x = 0, \\ 1/4 & \text{if } 1/9 \le x \le 2/9, \\ 1/2 & \text{if } 1/3 \le x \le 2/3, \\ 3/4 & \text{if } 7/9 \le x \le 8/9, \\ 1 & \text{if } x = 1, \end{cases} \]

and $F_2$ is linear on $C_2$. See Figure 6.



Figure 6. Construction of $F_2$


This process yields a sequence of continuous increasing functions
$\{F_n\}_{n=1}^{\infty}$ such that clearly

\[ |F_{n+1}(x) - F_n(x)| \le 2^{-n-1}. \]

Hence $\{F_n\}_{n=1}^{\infty}$ converges uniformly to a continuous limit $F$
called the Cantor-Lebesgue function (Figure 7).$^6$ By construction, $F$ is
increasing, $F(0) = 0$, $F(1) = 1$, and we see that $F$ is constant on each
interval of the complement of the Cantor set. Since $m(C) = 0$, we find that
$F'(x) = 0$ almost everywhere, as desired.


6 The reader may check that indeed this function agrees with the one given in Exercise 2
of Chapter 1.


Figure 7. The Cantor-Lebesgue function
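The approximations $F_n$ can be evaluated with a short recursion. The sketch below is an assumption-laden illustration rather than the construction as stated: it starts from a hypothetical $F_0(x) = x$ and uses the self-similar rule that the next approximation is a half-scale copy on $[0,1/3]$, the constant $1/2$ on the middle third, and $1/2$ plus a half-scale copy on $[2/3,1]$. With that rule the first two iterates coincide with $F_1$ and $F_2$ as described above.

```python
def cantor_approx(x, n):
    """Evaluate the n-th approximation at x in [0, 1], starting from F_0(x) = x."""
    if n == 0:
        return x
    if x <= 1.0 / 3.0:
        # Left third: half-scale copy of the previous approximation.
        return 0.5 * cantor_approx(min(3.0 * x, 1.0), n - 1)
    if x < 2.0 / 3.0:
        # Middle third: the function is constant.
        return 0.5
    # Right third: shifted half-scale copy.
    return 0.5 + 0.5 * cantor_approx(max(3.0 * x - 2.0, 0.0), n - 1)

grid = [k / 729 for k in range(730)]
# Largest observed gap between consecutive approximations, for n = 1, ..., 8.
gaps = {
    n: max(abs(cantor_approx(x, n + 1) - cantor_approx(x, n)) for x in grid)
    for n in range(1, 9)
}
for n, gap in gaps.items():
    print(n, gap, 2.0 ** (-n - 1))
```

On this grid the gaps stay below $2^{-n-1}$, in line with the uniform estimate, and a deep iterate such as `cantor_approx(x, 40)` serves as a numerical stand-in for the limit $F$.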


The considerations in this section, as well as this last example, show 
that the assumption of bounded variation guarantees the existence of a 
derivative almost everywhere, but not the validity of the formula 

\[ \int_a^b F'(x)\,dx = F(b) - F(a). \]

In the next section, we shall present a condition on a function that will 
completely settle the problem of establishing the above identity. 

3.2 Absolutely continuous functions 

A function $F$ defined on $[a,b]$ is absolutely continuous if for any
$\epsilon > 0$ there exists $\delta > 0$ so that

\[ \sum_{k=1}^{N} |F(b_k) - F(a_k)| < \epsilon \quad \text{whenever} \quad \sum_{k=1}^{N} (b_k - a_k) < \delta, \]

and the intervals $(a_k, b_k)$, $k = 1, \ldots, N$ are disjoint. Some general
remarks are in order.

• From the definition, it is clear that absolutely continuous functions 
are continuous, and in fact uniformly continuous. 

• If $F$ is absolutely continuous on a bounded interval, then it is also of
bounded variation on the same interval. Moreover, as is easily seen,
its total variation is continuous (in fact absolutely continuous). As
a consequence the decomposition of such a function $F$ into two
monotonic functions given in Section 3.1 shows that each of these
functions is continuous.

• If $F(x) = \int_a^x f(y)\,dy$ where $f$ is integrable, then $F$ is absolutely
continuous. This follows at once from (ii) in Proposition 1.12,
Chapter 2.

In fact, this last remark shows that absolute continuity is a necessary
condition to impose on $F$ if we hope to prove
$\int_a^b F'(x)\,dx = F(b) - F(a)$.
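By contrast, the Cantor-Lebesgue function of Section 3.1 is continuous and of bounded variation but not absolutely continuous. The sketch below makes this concrete with exact rational arithmetic: the $2^n$ intervals of the $n$-th stage of the Cantor set have total length $(2/3)^n$, yet the increments of the function across them add up to $1$. The helper names `cantor` and `cantor_stage`, and the evaluation through the self-similarity of the function, are choices made for this illustration.

```python
from fractions import Fraction

def cantor(x, depth=60):
    """Cantor-Lebesgue function at a rational x in [0, 1], via self-similarity.

    The value is exact whenever x is an endpoint of a stage C_n with n <= depth.
    """
    if depth == 0:
        return x
    if x <= Fraction(1, 3):
        return cantor(3 * x, depth - 1) / 2
    if x < Fraction(2, 3):
        return Fraction(1, 2)
    return Fraction(1, 2) + cantor(3 * x - 2, depth - 1) / 2

def cantor_stage(n):
    """The 2**n closed intervals whose union is the n-th stage C_n."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_stage = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            next_stage += [(lo, lo + third), (hi - third, hi)]
        intervals = next_stage
    return intervals

n = 8
stage = cantor_stage(n)
total_length = sum(hi - lo for lo, hi in stage)
total_increment = sum(cantor(hi) - cantor(lo) for lo, hi in stage)
print(float(total_length), total_increment)
```

Since the total length can be made smaller than any $\delta$ by increasing $n$ while the sum of increments stays equal to $1$, no $\delta$ works in the definition for $\epsilon < 1$.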

Theorem 3.8 If $F$ is absolutely continuous on $[a,b]$, then $F'(x)$ exists
almost everywhere. Moreover, if $F'(x) = 0$ for a.e. $x$, then $F$ is constant.

Since an absolutely continuous function is the difference of two
continuous monotonic functions, as we have seen above, the existence of
$F'(x)$ for a.e. $x$ follows from what we have already proved. To prove that
$F'(x) = 0$ a.e. implies $F$ is constant requires a more elaborate version of
the covering argument in Lemma 1.2. For the moment we revert to the
generality of $d$ dimensions to describe this.

A collection $\mathcal{B}$ of balls $\{B\}$ is said to be a Vitali covering of a
set $E$ if for every $x \in E$ and any $\eta > 0$ there is a ball
$B \in \mathcal{B}$, such that $x \in B$ and $m(B) < \eta$. Thus every point is
covered by balls of arbitrarily small measure.

Lemma 3.9 Suppose $E$ is a set of finite measure and $\mathcal{B}$ is a Vitali
covering of $E$. For any $\delta > 0$ we can find finitely many balls
$B_1, \ldots, B_N$ in $\mathcal{B}$ that are disjoint and so that

\[ \sum_{i=1}^{N} m(B_i) \ge m(E) - \delta. \]

Proof. We apply the elementary Lemma 1.2 iteratively, with the
aim of exhausting the set $E$. It suffices to take $\delta$ sufficiently small,
say $\delta < m(E)$, and using the just cited covering lemma, we can find an
initial collection of disjoint balls $B_1, B_2, \ldots, B_{N_1}$ in $\mathcal{B}$
such that $\sum_{i=1}^{N_1} m(B_i) \ge \gamma\delta$. (For simplicity of notation,
we have written $\gamma = 3^{-d}$.) Indeed, first we have $m(E') \ge \delta$ for
an appropriate compact subset $E'$ of $E$. Because of the compactness of $E'$,
we can cover it by finitely many balls from $\mathcal{B}$, and then the previous
lemma allows us to select a disjoint sub-collection of these balls
$B_1, B_2, \ldots, B_{N_1}$ such that
$\sum_{i=1}^{N_1} m(B_i) \ge \gamma\, m(E') \ge \gamma\delta$.

With $B_1, \ldots, B_{N_1}$ as our initial sequence of balls, we consider two
possibilities: either $\sum_{i=1}^{N_1} m(B_i) \ge m(E) - \delta$ and we are done
with $N = N_1$; or, contrariwise, $\sum_{i=1}^{N_1} m(B_i) < m(E) - \delta$. In the
second case, with $E_2 = E - \bigcup_{i=1}^{N_1} \overline{B_i}$, we have
$m(E_2) > \delta$ (recall that $m(B_i) = m(\overline{B_i})$). We then repeat the
previous argument, by choosing a compact subset $E_2'$ of $E_2$ with
$m(E_2') \ge \delta$, and by noting that the balls in $\mathcal{B}$ that are
disjoint from $\bigcup_{i=1}^{N_1} \overline{B_i}$ still cover $E_2$ and in fact
give a Vitali covering for $E_2$, and hence for $E_2'$. Thus we can choose a
finite disjoint collection of these balls $B_i$, $N_1 < i \le N_2$, so that
$\sum_{N_1 < i \le N_2} m(B_i) \ge \gamma\delta$. Therefore, now
$\sum_{i=1}^{N_2} m(B_i) \ge 2\gamma\delta$ and the balls $B_i$,
$1 \le i \le N_2$, are disjoint.

We again consider two alternatives, whether or not
$\sum_{i=1}^{N_2} m(B_i) \ge m(E) - \delta$. In the first case, we are done with
$N_2 = N$, and in the second case, we proceed as before. If, continuing this
way, we had reached the $k$-th stage and not stopped before then, we would
have selected a collection of disjoint balls with the sum of their measures
$\ge k\gamma\delta$. In any case, our process achieves the desired goal by the
$k$-th stage if $k \ge (m(E) - \delta)/\gamma\delta$, since in this case
$\sum_{i=1}^{N_k} m(B_i) \ge m(E) - \delta$.
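Each stage of the argument above relies on selecting disjoint balls from a finite collection while keeping at least the fraction $\gamma = 3^{-d}$ of the covered measure. Lemma 1.2 itself is not reproduced in this section, so the following is a sketch under the assumption that the selection is the usual greedy one: repeatedly keep the largest remaining ball that is disjoint from those already kept. It is written for open intervals on the line ($d = 1$, so $\gamma = 1/3$); the function names and the sample intervals are illustrative.

```python
def select_disjoint(intervals):
    """Greedily keep the longest remaining open interval disjoint from those kept."""
    chosen = []
    for a, b in sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True):
        # Open intervals (a, b) and (c, d) are disjoint iff b <= c or d <= a.
        if all(b <= c or d <= a for c, d in chosen):
            chosen.append((a, b))
    return chosen

def union_length(intervals):
    """Lebesgue measure of a finite union of intervals."""
    total, right_end = 0.0, float("-inf")
    for a, b in sorted(intervals):
        if b > right_end:
            total += b - max(a, right_end)
            right_end = b
    return total

collection = [(0.0, 1.0), (0.5, 1.5), (1.2, 1.4), (3.0, 3.5), (3.4, 3.6)]
chosen = select_disjoint(collection)
chosen_length = sum(b - a for a, b in chosen)
covered_length = union_length(collection)
print(sorted(chosen), chosen_length, covered_length)
```

Every discarded interval meets a kept interval that is at least as long, so it lies inside the threefold dilate of that kept interval; this is why the kept intervals carry at least one third of the covered length.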

A simple consequence is the following. 


Corollary 3.10 We can arrange the choice of the balls so that

\[ m\Bigl( E - \bigcup_{i=1}^{N} B_i \Bigr) < 2\delta. \]

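In fact, let $\mathcal{O}$ be an open set, with $\mathcal{O} \supset E$ and
$m(\mathcal{O} - E) < \delta$. Since we are dealing with a Vitali covering of $E$,
we can restrict all of our choices above to balls contained in $\mathcal{O}$. If
we do this, then
$(E - \bigcup_{i=1}^{N} B_i) \cup \bigcup_{i=1}^{N} B_i \subset \mathcal{O}$,
where the union on the left-hand side is a disjoint union. Hence

\[ m\Bigl( E - \bigcup_{i=1}^{N} B_i \Bigr) \le m(\mathcal{O}) - m\Bigl( \bigcup_{i=1}^{N} B_i \Bigr) \le m(E) + \delta - (m(E) - \delta) = 2\delta. \]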


We now return to the situation on the real line. To complete the proof 
of the theorem it suffices to show that under its hypotheses we have 
F(b) = F(a), since if that is proved, we can replace the interval [a, b] by 
any sub-interval. Now let $E$ be the set of those $x \in (a,b)$ where $F'(x)$
exists and is zero. By our assumption $m(E) = b - a$. Next, momentarily
fix $\epsilon > 0$. Since for each $x \in E$ we have

\[ \lim_{h \to 0} \left| \frac{F(x+h) - F(x)}{h} \right| = 0, \]

then for each $\eta > 0$ we have an open interval
$I = (a_x, b_x) \subset [a,b]$ containing $x$, with

\[ |F(b_x) - F(a_x)| \le \epsilon (b_x - a_x) \quad \text{and} \quad b_x - a_x < \eta. \]

The collection of these intervals forms a Vitali covering of $E$, and
hence by the lemma, for $\delta > 0$, we can select finitely many $I_i$,
$1 \le i \le N$, $I_i = (a_i, b_i)$, which are disjoint and such that

\[ (9) \qquad \sum_{i=1}^{N} m(I_i) \ge m(E) - \delta = (b - a) - \delta. \]

However, $|F(b_i) - F(a_i)| \le \epsilon (b_i - a_i)$, and upon adding these
inequalities we get

\[ \sum_{i=1}^{N} |F(b_i) - F(a_i)| \le \epsilon (b - a), \]

since the intervals $I_i$ are disjoint and lie in $[a,b]$. Next consider the
complement of $\bigcup_{j=1}^{N} I_j$ in $[a,b]$. It consists of finitely many
closed intervals $\bigcup_{k=1}^{M} [\alpha_k, \beta_k]$ with total length
$\le \delta$ because of (9). Thus by the absolute continuity of $F$ (if
$\delta$ is chosen appropriately in terms of $\epsilon$),
$\sum_{k=1}^{M} |F(\beta_k) - F(\alpha_k)| \le \epsilon$. Altogether, then,

\[ |F(b) - F(a)| \le \sum_{i=1}^{N} |F(b_i) - F(a_i)| + \sum_{k=1}^{M} |F(\beta_k) - F(\alpha_k)| \le \epsilon (b - a) + \epsilon. \]

Since $\epsilon$ was positive but otherwise arbitrary, we conclude that
$F(b) - F(a) = 0$, which we set out to show.

The culmination of all our efforts is contained in the next theorem. In 
particular, it resolves our second problem of establishing the reciprocity 
between differentiation and integration. 

Theorem 3.11 Suppose $F$ is absolutely continuous on $[a,b]$. Then $F'$
exists almost everywhere and is integrable. Moreover,

\[ F(x) - F(a) = \int_a^x F'(y)\,dy, \quad \text{for all } a \le x \le b. \]

By selecting $x = b$ we get $F(b) - F(a) = \int_a^b F'(y)\,dy$.

Conversely, if $f$ is integrable on $[a,b]$, then there exists an absolutely
continuous function $F$ such that $F'(x) = f(x)$ almost everywhere, and in
fact, we may take $F(x) = \int_a^x f(y)\,dy$.

Proof. Since we know that a real-valued absolutely continuous
function is the difference of two continuous increasing functions,
Corollary 3.7 shows that $F'$ is integrable on $[a,b]$. Now let
$G(x) = \int_a^x F'(y)\,dy$. Then $G$ is absolutely continuous; hence so is the
difference $G(x) - F(x)$. By the Lebesgue differentiation theorem
(Theorem 1.4), we know that $G'(x) = F'(x)$ for a.e. $x$; hence the difference
$F - G$ has derivative $0$ almost everywhere. By the previous theorem we
conclude that $F - G$ is constant, and evaluating this expression at $x = a$
gives the desired result.

The converse is a consequence of the observation we made earlier,
namely that $\int_a^x f(y)\,dy$ is absolutely continuous, and the Lebesgue
differentiation theorem, which gives $F'(x) = f(x)$ almost everywhere.


3.3 Differentiability of jump functions 

We now examine monotonic functions that are not assumed to be
continuous. The resulting analysis will allow us to remove the continuity
assumption made earlier in the proof of Theorem 3.4.

As before, we may assume that $F$ is increasing and bounded. In
particular, these two conditions guarantee that the limits

\[ F(x^-) = \lim_{y \to x,\ y < x} F(y) \quad \text{and} \quad F(x^+) = \lim_{y \to x,\ y > x} F(y) \]

exist. Then of course $F(x^-) \le F(x) \le F(x^+)$, and the function $F$ is
continuous at $x$ if $F(x^-) = F(x^+)$; otherwise, we say that it has a jump
discontinuity. Fortunately, dealing with these discontinuities is
manageable, since there can only be countably many of them.

Lemma 3.12 A bounded increasing function F on [a, b] has at most 
countably many discontinuities. 

Proof. If $F$ is discontinuous at $x$, we may choose a rational number
$r_x$ so that $F(x^-) < r_x < F(x^+)$. If $F$ is discontinuous at $x$ and $z$
with $x < z$, we must have $F(x^+) \le F(z^-)$, hence $r_x < r_z$. Consequently,
to each rational number corresponds at most one discontinuity of $F$, hence
$F$ can have at most a countable number of discontinuities.

Now let $\{x_n\}_{n=1}^{\infty}$ denote the points where $F$ is discontinuous,
and let $\alpha_n$ denote the jump of $F$ at $x_n$, that is,
$\alpha_n = F(x_n^+) - F(x_n^-)$. Then

\[ F(x_n^+) = F(x_n^-) + \alpha_n \]

and

\[ F(x_n) = F(x_n^-) + \theta_n \alpha_n, \quad \text{for some } \theta_n, \text{ with } 0 \le \theta_n \le 1. \]

If we let

\[ j_n(x) = \begin{cases} 0 & \text{if } x < x_n, \\ \theta_n & \text{if } x = x_n, \\ 1 & \text{if } x > x_n, \end{cases} \]

then we define the jump function associated to $F$ by

\[ J_F(x) = \sum_{n=1}^{\infty} \alpha_n j_n(x). \]

For simplicity, and when no confusion is possible, we shall write $J$ instead
of $J_F$.

Our first observation is that if $F$ is bounded, then we must have

\[ \sum_{n=1}^{\infty} \alpha_n \le F(b) - F(a) < \infty, \]

and hence the series defining $J$ converges absolutely and uniformly.


Lemma 3.13 If F is increasing and bounded on [a, b], then: 

(i) $J(x)$ is discontinuous precisely at the points $\{x_n\}$ and has a jump
at $x_n$ equal to that of $F$.


(ii) The difference F(x) — J(x) is increasing and continuous. 


Proof. If $x \ne x_n$ for all $n$, each $j_n$ is continuous at $x$, and since
the series converges uniformly, $J$ must be continuous at $x$. If $x = x_N$ for
some $N$, then we write

\[ J(x) = \sum_{n=1}^{N} \alpha_n j_n(x) + \sum_{n=N+1}^{\infty} \alpha_n j_n(x). \]

By the same argument as above, the series on the right-hand side is
continuous at $x$. Clearly, the finite sum has a jump discontinuity at $x_N$
of size $\alpha_N$.

For (ii), we note that (i) implies at once that $F - J$ is continuous.
Finally, if $y > x$ we have

\[ J(y) - J(x) \le \sum_{x < x_n \le y} \alpha_n \le F(y) - F(x), \]

where the last inequality follows since $F$ is increasing. Hence

\[ F(x) - J(x) \le F(y) - J(y), \]

and the difference $F - J$ is increasing, as desired.
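The decomposition of an increasing function into a jump part and a continuous part can be illustrated with finitely many jumps. In the sketch below the jump locations $x_n$, sizes $\alpha_n$, and values $\theta_n$ are made-up data, and `make_jump_function` is a hypothetical helper; the text allows countably many jumps, which a finite list only approximates.

```python
def make_jump_function(jumps):
    """Build J from triples (x_n, alpha_n, theta_n), alpha_n > 0, 0 <= theta_n <= 1."""
    def J(x):
        total = 0.0
        for x_n, alpha_n, theta_n in jumps:
            if x > x_n:
                total += alpha_n             # j_n(x) = 1 to the right of x_n
            elif x == x_n:
                total += theta_n * alpha_n   # j_n(x_n) = theta_n
        return total
    return J

# Illustrative data: three jumps inside [0, 1].
jumps = [(0.25, 0.5, 0.0), (0.5, 0.25, 1.0), (0.75, 0.125, 0.5)]
J = make_jump_function(jumps)

def F(x):
    """An increasing function whose continuous part is x and whose jump part is J."""
    return x + J(x)

for x in (0.2, 0.25, 0.26, 0.5, 0.75, 1.0):
    print(x, J(x), F(x) - J(x))
```

In this example $F - J$ is the identity function, which is continuous and increasing, as Lemma 3.13 predicts.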

Since we may write F(x) = [F(x) — J(x)] + J(x), our final task is to 
prove that J is differentiable almost everywhere. 

Theorem 3.14 If $J$ is the jump function considered above, then $J'(x)$
exists and vanishes almost everywhere.

Proof. Given any $\epsilon > 0$, we note that the set $E$ of those $x$ where

\[ (10) \qquad \limsup_{h \to 0} \frac{J(x+h) - J(x)}{h} > \epsilon \]

is a measurable set. (The proof of this little fact is outlined in Exercise 14
below.) Suppose $\delta = m(E)$. We need to show that $\delta = 0$. Now observe
that since the series $\sum \alpha_n$ arising in the definition of $J$ converges,
then for any $\eta$, to be chosen later, we can find an $N$ so large that
$\sum_{n > N} \alpha_n < \eta$. We then write

\[ J_0(x) = \sum_{n > N} \alpha_n j_n(x), \]

and because of our choice of $N$ we have

\[ (11) \qquad J_0(b) - J_0(a) < \eta. \]

However, $J - J_0$ is a finite sum of terms $\alpha_n j_n(x)$, and therefore the
set of points where (10) holds, with $J$ replaced by $J_0$, differs from $E$ by
at most a finite set, the points $\{x_1, x_2, \ldots, x_N\}$. Thus we can find a
compact set $K$, with $m(K) \ge \delta/2$, so that
$\limsup_{h \to 0} \frac{J_0(x+h) - J_0(x)}{h} > \epsilon$ for each $x \in K$.
Hence there are intervals $(a_x, b_x)$ containing $x$, $x \in K$, so that
$J_0(b_x) - J_0(a_x) > \epsilon (b_x - a_x)$. We can first choose a finite
collection of these intervals that covers $K$, and then apply Lemma 1.2 to
select intervals $I_1, \ldots, I_n$ which are disjoint, and for which
$\sum_{j=1}^{n} m(I_j) \ge m(K)/3$. The intervals $I_j = (a_j, b_j)$ of course
satisfy

\[ J_0(b_j) - J_0(a_j) > \epsilon (b_j - a_j). \]

Now,

\[ J_0(b) - J_0(a) \ge \sum_{j=1}^{n} J_0(b_j) - J_0(a_j) > \epsilon \sum_{j=1}^{n} (b_j - a_j) \ge \frac{\epsilon}{3}\, m(K) \ge \frac{\epsilon}{6}\, \delta. \]

Thus by (11), $\epsilon\delta/6 < \eta$, and since we are free to choose $\eta$,
it follows that $\delta = 0$ and the theorem is proved.

4 Rectifiable curves and the isoperimetric inequality 

We turn to the further study of rectifiable curves and take up first the
validity of the formula

\[ (12) \qquad L = \int_a^b \bigl( x'(t)^2 + y'(t)^2 \bigr)^{1/2}\,dt, \]

for the length $L$ of the curve parametrized by $(x(t), y(t))$.

We have already seen that rectifiable curves are precisely the curves
where, besides the assumed continuity of $x(t)$ and $y(t)$, these functions
are of bounded variation. However a simple example shows that
formula (12) does not always hold in this context. Indeed, let $x(t) = F(t)$
and $y(t) = F(t)$, where $F$ is the Cantor-Lebesgue function and
$0 \le t \le 1$. Then this parametrized curve traces out the straight line from
$(0,0)$ to $(1,1)$ and has length $\sqrt{2}$, yet $x'(t) = y'(t) = 0$ for a.e. $t$.

The integral formula expressing the length $L$ is in fact valid if we
assume in addition that the coordinate functions of the parametrization
are absolutely continuous.

Theorem 4.1 Suppose $(x(t), y(t))$ is a curve defined for $a \le t \le b$. If
both $x(t)$ and $y(t)$ are absolutely continuous, then the curve is rectifiable,
and if $L$ denotes its length, we have

\[ L = \int_a^b \bigl( x'(t)^2 + y'(t)^2 \bigr)^{1/2}\,dt. \]

Note that if $F(t) = x(t) + iy(t)$ is absolutely continuous then it is
automatically of bounded variation, and hence the curve is rectifiable. The
identity (12) is an immediate consequence of the proposition below, which
can be viewed as a more precise version of Corollary 3.7 for absolutely
continuous functions.
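Formula (12) can be compared numerically with the definition of length as a supremum of polygonal lengths. The following sketch assumes a smooth test curve, the parabola $z(t) = (t, t^2)$ on $[0,1]$, together with a midpoint-rule quadrature; the function names are illustrative and the curve is not taken from the text.

```python
import math

def polygonal_length(z, a, b, n):
    """Length of the polygonal line through z(t_k) for a uniform partition of [a, b]."""
    pts = [z(a + (b - a) * k / n) for k in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

def speed_integral(dz, a, b, n):
    """Midpoint rule for the integral of (x'(t)^2 + y'(t)^2)^(1/2) over [a, b]."""
    h = (b - a) / n
    return sum(math.hypot(*dz(a + (k + 0.5) * h)) for k in range(n)) * h

def parabola(t):
    return (t, t * t)

def parabola_velocity(t):
    return (1.0, 2.0 * t)

chord_length = polygonal_length(parabola, 0.0, 1.0, 2000)
integral_length = speed_integral(parabola_velocity, 0.0, 1.0, 2000)
print(chord_length, integral_length)
```

For this absolutely continuous parametrization the two numbers agree to several digits. For the Cantor-Lebesgue curve mentioned above the polygonal lengths would still approach $\sqrt{2}$ while the integrand vanishes almost everywhere.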

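Proposition 4.2 Suppose $F$ is complex-valued and absolutely continuous
on $[a,b]$. Then

\[ T_F(a,b) = \int_a^b |F'(t)|\,dt. \]

In fact, because of Theorem 3.11, for any partition
$a = t_0 < t_1 < \cdots < t_N = b$ of $[a,b]$, we have

\[ \sum_{j=1}^{N} |F(t_j) - F(t_{j-1})| = \sum_{j=1}^{N} \Bigl| \int_{t_{j-1}}^{t_j} F'(t)\,dt \Bigr| \le \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} |F'(t)|\,dt = \int_a^b |F'(t)|\,dt. \]

So this proves

\[ (13) \qquad T_F(a,b) \le \int_a^b |F'(t)|\,dt. \]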


To prove the reverse inequality, fix $\epsilon > 0$, and using Theorem 2.4 in
Chapter 2 find a step function $g$ on $[a,b]$, such that $F' = g + h$ with
$\int_a^b |h(t)|\,dt < \epsilon$. Set $G(x) = \int_a^x g(t)\,dt$, and
$H(x) = \int_a^x h(t)\,dt$. Then $F - F(a) = G + H$, and as is easily seen

\[ T_F(a,b) \ge T_G(a,b) - T_H(a,b). \]

However, by (13) $T_H(a,b) < \epsilon$, so that

\[ T_F(a,b) \ge T_G(a,b) - \epsilon. \]

Now partition the interval $[a,b]$, as $a = t_0 < \cdots < t_N = b$, so that the
step function $g$ is constant on each of the intervals $(t_{j-1}, t_j)$,
$j = 1, 2, \ldots, N$. Then

\[ T_G(a,b) \ge \sum_{j=1}^{N} |G(t_j) - G(t_{j-1})| = \sum_{j=1}^{N} \Bigl| \int_{t_{j-1}}^{t_j} g(t)\,dt \Bigr| = \sum_{j=1}^{N} \int_{t_{j-1}}^{t_j} |g(t)|\,dt = \int_a^b |g(t)|\,dt. \]

Since $\int_a^b |g(t)|\,dt \ge \int_a^b |F'(t)|\,dt - \epsilon$, we obtain as a
consequence that

\[ T_F(a,b) \ge \int_a^b |F'(t)|\,dt - 2\epsilon, \]

and letting $\epsilon \to 0$ we establish the assertion and also the theorem.

Now, any curve (viewed as the image of a mapping $t \mapsto z(t)$) can in
fact be realized by many different parametrizations. A rectifiable curve,
however, has associated to it a unique natural parametrization, the
arc-length parametrization. Indeed, let $L(A,B)$ denote the length function
(considered in Section 3.1), and for the variable $t$ in $[a,b]$ set
$s = s(t) = L(a,t)$. Then $s(t)$, the arc-length, is a continuous increasing
function which maps $[a,b]$ to $[0,L]$, where $L$ is the length of the curve.
The arc-length parametrization of the curve is now given by the pair
$\tilde{z}(s) = \tilde{x}(s) + i\tilde{y}(s)$, where $\tilde{z}(s) = z(t)$, for
$s = s(t)$. Notice that in this way the function $\tilde{z}(s)$ is well defined
on $[0,L]$, since if $s(t_1) = s(t_2)$, $t_1 < t_2$, then in fact $z(t)$ does not
vary in the interval $[t_1, t_2]$ and thus $z(t_1) = z(t_2)$. Moreover
$|\tilde{z}(s_1) - \tilde{z}(s_2)| \le |s_1 - s_2|$, for all pairs
$s_1, s_2 \in [0,L]$, since the left-hand side of the inequality is the distance
between two points on the curve, while the right-hand side is the length of
the portion of the curve joining these two points. Also, as $s$ varies from
$0$ to $L$, $\tilde{z}(s)$ traces out the same points (in the same order) that
$z(t)$ does as $t$ varies from $a$ to $b$.

Theorem 4.3 Suppose $(x(t), y(t))$, $a \le t \le b$, is a rectifiable curve that
has length $L$. Consider the arc-length parametrization
$\tilde{z}(s) = (\tilde{x}(s), \tilde{y}(s))$ described above. Then $\tilde{x}$
and $\tilde{y}$ are absolutely continuous, $|\tilde{z}'(s)| = 1$ for almost
every $s \in [0,L]$, and

\[ L = \int_0^L \bigl( \tilde{x}'(s)^2 + \tilde{y}'(s)^2 \bigr)^{1/2}\,ds. \]

Proof. We noted that $|\tilde{z}(s_1) - \tilde{z}(s_2)| \le |s_1 - s_2|$, so it
follows immediately that $\tilde{z}(s)$ is absolutely continuous, hence
differentiable almost everywhere. Moreover, this inequality also proves that
$|\tilde{z}'(s)| \le 1$, for almost every $s$. By definition the total variation
of $\tilde{z}$ equals $L$, and by the above theorem we must have

\[ L = \int_0^L |\tilde{z}'(s)|\,ds. \]

Finally, we note that this identity is possible only when
$|\tilde{z}'(s)| = 1$ almost everywhere.
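A discrete version of the arc-length parametrization can be built from cumulative chord lengths. The sketch below assumes a polygonal approximation of the curve and a sample parametrization of the unit circle with non-constant speed, $z(t) = (\cos t^2, \sin t^2)$; the helper names `arc_length_table` and `point_at` are illustrative. It then checks the two properties used in the proof above: the reparametrized curve is distance-decreasing in $s$ and has unit speed.

```python
import math
from bisect import bisect_right

def arc_length_table(z, a, b, n):
    """Sample z on a uniform grid and accumulate chord lengths 0 = s_0 <= s_1 <= ..."""
    pts = [z(a + (b - a) * k / n) for k in range(n + 1)]
    s = [0.0]
    for p, q in zip(pts, pts[1:]):
        s.append(s[-1] + math.dist(p, q))
    return pts, s

def point_at(pts, s, s_query):
    """Point of the polygonal curve at arc length s_query (linear interpolation)."""
    j = max(1, min(bisect_right(s, s_query), len(s) - 1))
    seg = s[j] - s[j - 1]
    w = 0.0 if seg == 0 else (s_query - s[j - 1]) / seg
    (x0, y0), (x1, y1) = pts[j - 1], pts[j]
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))

def z(t):
    # Unit circle traversed once, with speed 2t rather than constant speed.
    return (math.cos(t * t), math.sin(t * t))

pts, s = arc_length_table(z, 0.0, math.sqrt(2.0 * math.pi), 4000)
L = s[-1]
local_speed = math.dist(point_at(pts, s, 1.0), point_at(pts, s, 1.0 + 1e-6)) / 1e-6
print(L, local_speed)
```

The total chord length approximates $2\pi$, and the difference quotient of the reparametrized curve is close to $1$ regardless of how unevenly the original parameter $t$ moves along the circle.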


4.1* Minkowski content of a curve 

The proof we give below of the isoperimetric inequality depends in a key
way on the concept of the Minkowski content. While the idea of this
content has an interest in its own right, it is particularly relevant for us
here. This is because the rectifiability of a curve is tantamount to having
(finite) Minkowski content, with that quantity the same as the length of
the curve.

We begin our discussion of these matters with several definitions. A
curve parametrized by $z(t) = (x(t), y(t))$, $a \le t \le b$, is said to be simple
if the mapping $t \mapsto z(t)$ is injective for $t \in [a,b]$. It is a closed
simple curve if the mapping $t \mapsto z(t)$ is injective for $t$ in $[a,b)$, and
$z(a) = z(b)$. More generally, a curve is quasi-simple if the mapping is
injective for $t$ in the complement of finitely many points in $[a,b]$.



Figure 8. A quasi-simple curve


We shall find it convenient to designate by $\Gamma$ the pointset traced out
by the curve $z(t)$ as $t$ varies in $[a,b]$, that is,
$\Gamma = \{ z(t) : a \le t \le b \}$. For any compact set
$K \subset \mathbb{R}^2$ (we take $K = \Gamma$ below), we denote by $K^\delta$ the
open set that consists of all points at distance (strictly) less than
$\delta$ from $K$,

\[ K^\delta = \{ x \in \mathbb{R}^2 : d(x, K) < \delta \}. \]



Figure 9. The curve $\Gamma$ and the set $\Gamma^\delta$

We then say that the set $K$ has Minkowski content$^7$ if the limit

\[ \lim_{\delta \to 0} \frac{m(K^\delta)}{2\delta} \]

exists. When this limit exists, we denote it by $\mathcal{M}(K)$.

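Theorem 4.4 Suppose $\Gamma = \{ z(t), a \le t \le b \}$ is a quasi-simple curve.
The Minkowski content of $\Gamma$ exists if and only if $\Gamma$ is rectifiable.
When this is the case and $L$ is the length of the curve, then
$\mathcal{M}(\Gamma) = L$.

To prove the theorem, we also consider for any compact set $K$

\[ \mathcal{M}^*(K) = \limsup_{\delta \to 0} \frac{m(K^\delta)}{2\delta} \quad \text{and} \quad \mathcal{M}_*(K) = \liminf_{\delta \to 0} \frac{m(K^\delta)}{2\delta} \]

(both taken as extended positive numbers). Of course
$\mathcal{M}_*(K) \le \mathcal{M}^*(K)$. To say that the Minkowski content exists
is the same as saying that $\mathcal{M}^*(K) < \infty$ and
$\mathcal{M}_*(K) = \mathcal{M}^*(K)$. Their common value is then
$\mathcal{M}(K)$.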

The theorem just stated is the consequence of two propositions
concerning $\mathcal{M}_*(K)$ and $\mathcal{M}^*(K)$. The first is as follows.

Proposition 4.5 Suppose $\Gamma = \{ z(t), a \le t \le b \}$ is a quasi-simple
curve. If $\mathcal{M}_*(\Gamma) < \infty$, then the curve is rectifiable, and if
$L$ denotes its length, then

\[ L \le \mathcal{M}_*(\Gamma). \]


The proof depends on the following simple observation. 

Lemma 4.6 If $\Gamma = \{ z(t), a \le t \le b \}$ is any curve, and
$\Delta = |z(b) - z(a)|$ is the distance between its end-points, then
$m(\Gamma^\delta) \ge 2\delta\Delta$.

Proof. Since the distance function and the Lebesgue measure are
invariant under translations and rotations (see Section 3 in Chapter 1
and Problem 4 in Chapter 2) we may transform the situation by an
appropriate composition of these motions. Therefore we may assume
that the end-points of the curve have been placed on the $x$-axis, and
thus we may suppose that $z(a) = (A, 0)$, $z(b) = (B, 0)$ with $A \le B$, and
$\Delta = B - A$ (in the case $A = B$ the conclusion is automatically verified).

By the continuity of the function $x(t)$, there is for each $x$ in $[A,B]$ a
value $\bar{t}$ in $[a,b]$, such that $x = x(\bar{t})$. Since
$Q = (x(\bar{t}), y(\bar{t})) \in \Gamma$, the set $\Gamma^\delta$ contains a
segment parallel to the $y$-axis, of length $2\delta$ centered at $Q$ lying above
$x$ (see Figure 10). In other words the slice $(\Gamma^\delta)_x$ contains
the interval $(y(\bar{t}) - \delta, y(\bar{t}) + \delta)$, and hence
$m_1((\Gamma^\delta)_x) \ge 2\delta$ (where $m_1$ is the one-dimensional Lebesgue
measure). However by Fubini's theorem

\[ m(\Gamma^\delta) = \int_{\mathbb{R}} m_1((\Gamma^\delta)_x)\,dx \ge \int_A^B m_1((\Gamma^\delta)_x)\,dx \ge 2\delta (B - A) = 2\delta\Delta, \]

and the lemma is proved.

7 This is one-dimensional Minkowski content; variants are in Exercise 28 and also in
Chapter 7 below.



Figure 10. The situation in Lemma 4.6 


We now pass to the proof of the proposition. Let us assume first that
the curve is simple. Let $P$ be any partition $a = t_0 < t_1 < \cdots < t_N = b$
of the interval $[a, b]$, and let $L_P$ denote the length of the corresponding
polygonal line, that is,

\[ L_P = \sum_{j=1}^{N} |z(t_j) - z(t_{j-1})|. \]

For each $\epsilon > 0$, the continuity of $t \mapsto z(t)$ guarantees the existence of $N$
proper closed sub-intervals $I_j = [a_j, b_j]$ of $(t_{j-1}, t_j)$, so that

\[ \sum_{j=1}^{N} |z(b_j) - z(a_j)| \ge L_P - \epsilon. \]

Let $\Gamma_j$ denote the segment of the curve given by $\Gamma_j = \{z(t);\ t \in I_j\}$. Since
the closed intervals $I_1, \ldots, I_N$ are disjoint, it follows by the simplicity of
the curve that the compact sets $\Gamma_1, \Gamma_2, \ldots, \Gamma_N$ are disjoint. However,
$\Gamma \supset \bigcup_{j=1}^{N} \Gamma_j$ and $\Gamma^\delta \supset \bigcup_{j=1}^{N} (\Gamma_j)^\delta$. Moreover, the disjointness of the $\Gamma_j$
implies that the sets $(\Gamma_j)^\delta$ are also disjoint for sufficiently small $\delta$. Hence
for those $\delta$, the previous lemma applied to each $\Gamma_j$ gives

\[ m(\Gamma^\delta) \ge \sum_{j=1}^{N} m((\Gamma_j)^\delta) \ge 2\delta \sum_{j=1}^{N} |z(b_j) - z(a_j)|. \]

As a result, $m(\Gamma^\delta)/(2\delta) \ge L_P - \epsilon$, and a passage to the limit gives
$\mathcal{M}_*(\Gamma) \ge L_P - \epsilon$. Since this inequality is true for all partitions $P$ and
all $\epsilon > 0$, it implies that the curve is rectifiable and its length does not
exceed $\mathcal{M}_*(\Gamma)$.

The proof when the curve is merely quasi-simple is similar, except
the partitions $P$ considered must be refined so as to include as partition
points those (finitely many) points in whose complement (in $[a, b]$) the
mapping $t \mapsto z(t)$ is injective. The details may be left to the reader.

The second proposition is in the reverse direction. 

Proposition 4.7 Suppose $\Gamma = \{z(t),\ a \le t \le b\}$ is a rectifiable curve with
length $L$. Then

\[ \mathcal{M}^*(\Gamma) \le L. \]

The quantities $\mathcal{M}^*(\Gamma)$ and $L$ are of course independent of the
parametrization used; since the curve is rectifiable, it will be convenient to use the
arc-length parametrization. Thus we write the curve as $z(s) = (x(s), y(s))$,
with $0 \le s \le L$, and recall that then $z(s)$ is absolutely continuous and
$|z'(s)| = 1$ for a.e. $s \in [0, L]$.

We first fix any $0 < \epsilon < 1$, and find a measurable set $E_\epsilon \subset \mathbb{R}$ and a
positive number $r_\epsilon$ such that $m(E_\epsilon) < \epsilon$ and

\[ \sup_{0 < |h| < r_\epsilon} \left| \frac{z(s+h) - z(s)}{h} - z'(s) \right| < \epsilon \quad \text{for all } s \in [0, L] - E_\epsilon. \tag{14} \]

Indeed, for each integer $n$, let

\[ F_n(s) = \sup_{0 < |h| < 1/n} \left| \frac{z(s+h) - z(s)}{h} - z'(s) \right| \]

(where $z(s)$ has been extended outside $[0, L]$, so that $z(s) = z(0)$ when
$s < 0$, and $z(s) = z(L)$ when $s > L$). Because $z(s)$ is continuous the
supremum over $h$ in the definition of $F_n(s)$ can be replaced by a supremum
of countably many measurable functions, and hence each $F_n$ is measurable.
However, $F_n(s) \to 0$ as $n \to \infty$ for a.e. $s \in [0, L]$. Thus by Egorov's
theorem the convergence is uniform outside a set $E_\epsilon$ with $m(E_\epsilon) < \epsilon$,
and so we merely need to choose $r_\epsilon = 1/n$ for sufficiently large $n$ to
establish (14). It will be convenient in what follows to assume, as we may,
that $z'(s)$ exists and $|z'(s)| = 1$ for every $s \notin E_\epsilon$.

Now for any $0 < \rho < r_\epsilon$ (with $\rho < 1$), we partition the interval $[0, L]$
into consecutive closed intervals, each of length $\rho$ (except that the last
interval may have length $\le \rho$). Then there is a total of $N \le L/\rho + 1$ such
intervals that arise. We call these intervals $I_1, \ldots, I_N$, and divide them
into two classes. The first class, those intervals $I_j$ we call "good," are the
ones that enjoy the property that $I_j \not\subset E_\epsilon$. The second class, those which
are "bad," have the property that $I_j \subset E_\epsilon$. As a result, $\bigcup_{I_j \text{ bad}} I_j \subset E_\epsilon$,
hence the union has measure $< \epsilon$.

We have of course that $[0, L] \subset \bigcup_{j=1}^{N} I_j$, and if we denote by $\Gamma_j$ the
segment of $\Gamma$ given by $\{z(s) : s \in I_j\}$, then $\Gamma = \bigcup_{j=1}^{N} \Gamma_j$, and as a result

\[ \Gamma^\delta = \bigcup_{j=1}^{N} (\Gamma_j)^\delta \quad\text{and}\quad m(\Gamma^\delta) \le \sum_{j=1}^{N} m((\Gamma_j)^\delta). \]

We consider first the contribution of $m((\Gamma_j)^\delta)$ when $I_j$ is a good
interval. Recall that for such $I_j = [a_j, b_j]$ there is an $s_0 \in I_j$ which is not
in $E_\epsilon$, and therefore (14) holds for $s = s_0$. Let us now visualize $\Gamma_j$ by
introducing a coordinate system such that $z(s_0) = 0$ and $z'(s_0) = 1$ (which
we may assume after a suitable translation and rotation). We maintain
the notations $z(s)$ and $\Gamma_j$ for the so transformed segment of the curve.



Figure 11. Estimate of m((Tj) s ) for a good interval Ij 

Note that as $h$ varies over the interval $[a_j - s_0,\ b_j - s_0]$, $s_0 + h$ varies
over $I_j = [a_j, b_j]$. Therefore $\Gamma_j$ is contained in the rectangle

\[ [a_j - s_0 - \epsilon\rho,\ b_j - s_0 + \epsilon\rho] \times [-\epsilon\rho,\ \epsilon\rho], \]

since $|h| \le \rho < r_\epsilon$ by construction, and $|z(s_0 + h) - h| < \epsilon|h|$ by (14). See
Figure 11. Thus $(\Gamma_j)^\delta$ is contained in the rectangle

\[ [a_j - s_0 - \epsilon\rho - \delta,\ b_j - s_0 + \epsilon\rho + \delta] \times [-\epsilon\rho - \delta,\ \epsilon\rho + \delta], \]

which has measure $\le (\rho + 2\epsilon\rho + 2\delta)(2\epsilon\rho + 2\delta)$. Therefore, since $\epsilon \le 1$,
we have

\[ m((\Gamma_j)^\delta) \le 2\delta\rho + O(\epsilon\delta\rho + \delta^2 + \epsilon\rho^2), \tag{15} \]

where the bound arising in $O$ is independent of $\epsilon$, $\delta$, and $\rho$. This is our
desired estimate for the good intervals.

To pass to the remaining intervals we use the fact that $|z(s) - z(s')| \le
|s - s'|$ for all $s$ and $s'$. Thus in every case $\Gamma_j$ is contained in a ball
(disc) of radius $\rho$, and hence $(\Gamma_j)^\delta$ is contained in a ball of radius $\rho + \delta$.
Therefore we have the crude estimate

\[ m((\Gamma_j)^\delta) = O(\delta^2 + \rho^2). \tag{16} \]

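We now sum (15) over the good intervals (of which there are at most
$L/\rho + 1$), and (16) over the bad intervals. There are at most $\epsilon/\rho + 1$
of the latter kind, since their union is included in $E_\epsilon$ and this set has
measure $< \epsilon$. Altogether, then,

\[ m(\Gamma^\delta) \le 2\delta L + 2\delta\rho + O\!\left(\epsilon\delta + \frac{\delta^2}{\rho} + \epsilon\rho\right) + O\!\left(\left(\frac{\epsilon}{\rho} + 1\right)(\delta^2 + \rho^2)\right), \]

which simplifies to the inequalities

\[ \frac{m(\Gamma^\delta)}{2\delta} \le L + O\!\left(\rho + \epsilon + \frac{\delta}{\rho} + \frac{\epsilon\rho}{\delta} + \frac{\epsilon\delta}{\rho} + \frac{\epsilon\rho}{\delta} + \delta + \frac{\rho^2}{\delta}\right) \le L + O\!\left(\rho + \epsilon + \frac{\delta}{\rho} + \frac{\epsilon\rho}{\delta} + \frac{\rho^2}{\delta}\right), \]

where in the last line we have used the fact that $\epsilon < 1$ and $\rho < 1$. In
order to obtain a favorable estimate from this as $\delta \to 0$, we need to
choose $\rho$ (the length of the sub-intervals) very roughly of the same size
as $\delta$. An effective choice is $\rho = \delta/\epsilon^{1/2}$. If we fix this choice and restrict
our attention to $\delta$ for which $0 < \delta < \epsilon^{1/2} r_\epsilon$, then automatically $\rho < r_\epsilon$,
as required by (14). Inserting $\rho = \delta/\epsilon^{1/2}$ in the above inequality gives

\[ \frac{m(\Gamma^\delta)}{2\delta} \le L + O\!\left(\frac{\delta}{\epsilon^{1/2}} + \epsilon + \epsilon^{1/2} + \frac{\delta}{\epsilon}\right), \]

and thus

\[ \limsup_{\delta \to 0} \frac{m(\Gamma^\delta)}{2\delta} \le L + O(\epsilon + \epsilon^{1/2}). \]

Now we can let $\epsilon \to 0$ to obtain the desired conclusion $\mathcal{M}^*(\Gamma) \le L$, and
the proofs of the proposition and theorem are complete.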
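The role of the choice $\rho = \delta/\epsilon^{1/2}$ can be seen numerically. The sketch below evaluates the bracket $\rho + \epsilon + \delta/\rho + \epsilon\rho/\delta + \rho^2/\delta$ from the final estimate; the function name is hypothetical, and the implied constant in the $O$-term is simply taken to be 1 for illustration.

```python
import math

def error_bracket(delta: float, eps: float) -> float:
    """Evaluate rho + eps + delta/rho + eps*rho/delta + rho^2/delta
    with the choice rho = delta / sqrt(eps) made in the proof."""
    rho = delta / math.sqrt(eps)
    return rho + eps + delta / rho + eps * rho / delta + rho ** 2 / delta

if __name__ == "__main__":
    eps = 0.01
    for delta in (1e-3, 1e-5, 1e-7, 1e-9):
        print(f"delta={delta:g}  bracket={error_bracket(delta, eps):.8f}")
    # As delta -> 0 the bracket approaches eps + 2*sqrt(eps).
    print("limit:", eps + 2 * math.sqrt(eps))
```

With this $\rho$ the two terms $\delta/\rho$ and $\epsilon\rho/\delta$ both equal $\epsilon^{1/2}$, while $\rho$ and $\rho^2/\delta$ vanish as $\delta \to 0$, which is why the limsup is bounded by $L + O(\epsilon + \epsilon^{1/2})$.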


4.2* Isoperimetric inequality 

The isoperimetric inequality in the plane states, in effect, that among all 
curves of a given length it is the circle that encloses the maximum area. 
A simple form of this theorem already appeared in Book I. While the 
proof given there had the virtue of being brief and elegant, it did suffer 
several shortcomings. Among them the “area” in the statement was 
defined indirectly via a technical artifice, and the scope of the conclusion 
was limited because only relatively smooth curves were considered. Here 
we want to remedy those defects and deal with a general version of the 
result. 

We suppose that $\Omega$ is a bounded open subset of $\mathbb{R}^2$, and that its boundary
$\overline{\Omega} - \Omega$ is a rectifiable curve $\Gamma$, with length $\ell(\Gamma)$. We do not require
that $\Gamma$ be a simple closed curve. The isoperimetric theorem then asserts
the following.

Theorem 4.8 $4\pi\, m(\Omega) \le \ell(\Gamma)^2$.

Proof. For each $\delta > 0$ we consider the outer set

\[ \Omega_+(\delta) = \{x \in \mathbb{R}^2 : d(x, \overline{\Omega}) < \delta\}, \]

and the inner set

\[ \Omega_-(\delta) = \{x \in \mathbb{R}^2 : d(x, \Omega^c) \ge \delta\}. \]

Thus $\Omega_-(\delta) \subset \Omega \subset \Omega_+(\delta)$.

We notice that for $\Gamma^\delta = \{x : d(x, \Gamma) < \delta\}$ we have

\[ \Omega_+(\delta) = \Omega_-(\delta) \cup \Gamma^\delta, \tag{17} \]

and that this union is disjoint. Moreover, if $D(\delta)$ is the open ball (disc)
of radius $\delta$ centered at the origin, $D(\delta) = \{x \in \mathbb{R}^2,\ |x| < \delta\}$, then clearly

\[ \begin{cases} \Omega_+(\delta) \supset \Omega + D(\delta), \\ \Omega \supset \Omega_-(\delta) + D(\delta). \end{cases} \tag{18} \]

We now apply the Brunn-Minkowski inequality (Theorem 5.1 in Chapter 1)
to the first inclusion, and obtain

\[ m(\Omega_+(\delta)) \ge \left(m(\Omega)^{1/2} + m(D(\delta))^{1/2}\right)^2. \]

Since $m(D(\delta)) = \pi\delta^2$ (this standard formula is established in Exercise 14
in the previous chapter), and $(A + B)^2 \ge A^2 + 2AB$ whenever $A$ and $B$
are positive, we find that

\[ m(\Omega_+(\delta)) \ge m(\Omega) + 2\pi^{1/2}\delta\, m(\Omega)^{1/2}. \]

Similarly, $m(\Omega) \ge m(\Omega_-(\delta)) + 2\pi^{1/2}\delta\, m(\Omega_-(\delta))^{1/2}$ using the second
inclusion in (18), which implies

\[ -m(\Omega_-(\delta)) \ge -m(\Omega) + 2\pi^{1/2}\delta\, m(\Omega_-(\delta))^{1/2}. \]

Now by (17)

\[ m(\Gamma^\delta) = m(\Omega_+(\delta)) - m(\Omega_-(\delta)), \]

and by the inequalities above, we have

\[ m(\Gamma^\delta) \ge 2\pi^{1/2}\delta\left(m(\Omega)^{1/2} + m(\Omega_-(\delta))^{1/2}\right). \]

We now divide both sides by $2\delta$ and take the limsup as $\delta \to 0$. This
yields

\[ \mathcal{M}^*(\Gamma) \ge \pi^{1/2}\left(2\, m(\Omega)^{1/2}\right), \]

since $\Omega_-(\delta) \nearrow \Omega$ as $\delta \to 0$. However, by Proposition 4.7, $\ell(\Gamma) \ge \mathcal{M}^*(\Gamma)$,
so

\[ \ell(\Gamma) \ge 2\pi^{1/2}\, m(\Omega)^{1/2}, \]

which proves the theorem.

Remark. A similar result holds even without the assumption that the
boundary is a (rectifiable) curve. In fact the proof shows that for any
bounded open set $\Omega$ whose boundary is $\Gamma$ we have

\[ 4\pi\, m(\Omega) \le \mathcal{M}^*(\Gamma)^2. \]
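The inequality $4\pi\, m(\Omega) \le \ell(\Gamma)^2$ is easy to test on polygons, where area and perimeter are computable exactly. The following is a minimal sketch under that assumption (simple polygons only); the helper names are hypothetical, and the shoelace formula is used for the area.

```python
import math

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula
    (vertices listed in order around the boundary)."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def polygon_perimeter(vertices):
    """Length of the closed polygonal boundary."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

def isoperimetric_deficit(vertices):
    """Return l^2 - 4*pi*m, which the isoperimetric inequality says is >= 0."""
    return polygon_perimeter(vertices) ** 2 - 4.0 * math.pi * polygon_area(vertices)

def regular_polygon(n, radius=1.0):
    """Vertices of a regular n-gon inscribed in a circle of the given radius."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

if __name__ == "__main__":
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    print("square deficit:", isoperimetric_deficit(square))
    for n in (3, 6, 50, 1000):
        print(n, isoperimetric_deficit(regular_polygon(n)))
```

The deficit stays positive and shrinks as the regular polygons approach the circle, consistent with the circle being the extremal case.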

5 Exercises 

1. Suppose $\varphi$ is an integrable function on $\mathbb{R}^d$ with $\int_{\mathbb{R}^d} \varphi(x)\,dx = 1$. Set
$K_\delta(x) = \delta^{-d}\varphi(x/\delta)$, $\delta > 0$.

(a) Prove that $\{K_\delta\}_{\delta > 0}$ is a family of good kernels.

(b) Assume in addition that $\varphi$ is bounded and supported in a bounded set.
Verify that $\{K_\delta\}_{\delta > 0}$ is an approximation to the identity.

(c) Show that Theorem 2.3 (convergence in the $L^1$-norm) holds for good kernels
as well.


2. Suppose $\{K_\delta\}$ is a family of kernels that satisfies:

(i) $|K_\delta(x)| \le A\delta^{-d}$ for all $\delta > 0$.

(ii) $|K_\delta(x)| \le A\delta/|x|^{d+1}$ for all $\delta > 0$.

(iii) $\int_{\mathbb{R}^d} K_\delta(x)\,dx = 0$ for all $\delta > 0$.

Thus $K_\delta$ satisfies conditions (i) and (ii) of approximations to the identity, but the
average value of $K_\delta$ is 0 instead of 1. Show that if $f$ is integrable on $\mathbb{R}^d$, then

\[ (f * K_\delta)(x) \to 0 \quad \text{for a.e. } x, \text{ as } \delta \to 0. \]


3. Suppose 0 is a point of (Lebesgue) density of the set $E \subset \mathbb{R}$. Show that for each
of the individual conditions below there is an infinite sequence of points $x_n \in E$,
with $x_n \ne 0$, and $x_n \to 0$ as $n \to \infty$.

(a) The sequence also satisfies $-x_n \in E$ for all $n$.

(b) In addition, $2x_n$ belongs to $E$ for all $n$.

Generalize.


4. Prove that if $f$ is integrable on $\mathbb{R}^d$, and $f$ is not identically zero, then

\[ f^*(x) \ge \frac{c}{|x|^d} \quad \text{for some } c > 0 \text{ and all } |x| \ge 1. \]

Conclude that $f^*$ is not integrable on $\mathbb{R}^d$. Then, show that the weak type estimate

\[ m(\{x : f^*(x) > \alpha\}) \le c/\alpha \]

for all $\alpha > 0$ whenever $\int |f| = 1$, is best possible in the following sense: if $f$ is
supported in the unit ball with $\int |f| = 1$, then

\[ m(\{x : f^*(x) > \alpha\}) \ge c'/\alpha \]

for some $c' > 0$ and all sufficiently small $\alpha$.

[Hint: For the first part, use the fact that $\int_B |f| > 0$ for some ball $B$.]


5. Consider the function on $\mathbb{R}$ defined by

\[ f(x) = \begin{cases} \dfrac{1}{|x|(\log 1/|x|)^2} & \text{if } |x| \le 1/2, \\ 0 & \text{otherwise.} \end{cases} \]

(a) Verify that $f$ is integrable.

(b) Establish the inequality

\[ f^*(x) \ge \frac{c}{|x|(\log 1/|x|)} \quad \text{for some } c > 0 \text{ and all } |x| \le 1/2, \]

to conclude that the maximal function $f^*$ is not locally integrable.


6. In one dimension there is a version of the basic inequality (1) for the maximal
function in the form of an identity. We define the "one-sided" maximal function

\[ f^*_+(x) = \sup_{h > 0} \frac{1}{h} \int_x^{x+h} |f(y)|\,dy. \]

If $E^+_\alpha = \{x \in \mathbb{R} : f^*_+(x) > \alpha\}$, then

\[ m(E^+_\alpha) = \frac{1}{\alpha} \int_{E^+_\alpha} |f(y)|\,dy. \]

[Hint: Apply Lemma 3.5 to $F(x) = \int_0^x |f(y)|\,dy - \alpha x$. Then $E^+_\alpha$ is the union of
disjoint intervals $(a_k, b_k)$ with $\int_{a_k}^{b_k} |f(y)|\,dy = \alpha(b_k - a_k)$.]

7. Using Corollary 1.5, prove that if a measurable subset $E$ of $[0, 1]$ satisfies
$m(E \cap I) \ge \alpha\, m(I)$ for some $\alpha > 0$ and all intervals $I$ in $[0, 1]$, then $E$ has measure
1. See also Exercise 28 in Chapter 1.


8. Suppose $A$ is a Lebesgue measurable set in $\mathbb{R}$ with $m(A) > 0$. Does there exist
a sequence $\{s_n\}_{n=1}^{\infty}$ such that the complement of $\bigcup_{n=1}^{\infty} (A + s_n)$ in $\mathbb{R}$ has measure
zero?

[Hint: For every $\epsilon > 0$, find an interval $I_\epsilon$ of length $\ell_\epsilon$ such that $m(A \cap I_\epsilon) \ge
(1 - \epsilon)\, m(I_\epsilon)$. Consider $\bigcup_{k=-\infty}^{\infty} (A + t_k)$, with $t_k = k\ell_\epsilon$. Then vary $\epsilon$.]

9. Let $F$ be a closed subset in $\mathbb{R}$, and $\delta(x)$ the distance from $x$ to $F$, that is,

\[ \delta(x) = d(x, F) = \inf\{|x - y| : y \in F\}. \]

Clearly, $\delta(x + y) \le |y|$ whenever $x \in F$. Prove the more refined estimate

\[ \delta(x + y) = o(|y|) \quad \text{for a.e. } x \in F, \]

that is, $\delta(x + y)/|y| \to 0$ for a.e. $x \in F$.

[Hint: Assume that $x$ is a point of density of $F$.]

10. Construct an increasing function on $\mathbb{R}$ whose set of discontinuities is
precisely $\mathbb{Q}$.

11. If $a, b > 0$, let

\[ f(x) = \begin{cases} x^a \sin(x^{-b}) & \text{for } 0 < x \le 1, \\ 0 & \text{if } x = 0. \end{cases} \]

Prove that $f$ is of bounded variation in $[0, 1]$ if and only if $a > b$. Then, by
taking $a = b$, construct (for each $0 < \alpha < 1$) a function that satisfies the Lipschitz
condition of exponent $\alpha$

\[ |f(x) - f(y)| \le A|x - y|^\alpha \]

but which is not of bounded variation.

[Hint: Note that if $h > 0$, the difference $|f(x + h) - f(x)|$ can be estimated by
$C(x + h)^a$, or $C'h/x$ by the mean value theorem. Then, consider two cases,
whether $x^{a+1} \ge h$ or $x^{a+1} < h$. What is the relationship between $\alpha$ and $a$?]

12. Consider the function $F(x) = x^2 \sin(1/x^2)$, $x \ne 0$, with $F(0) = 0$. Show that
$F'(x)$ exists for every $x$, but $F'$ is not integrable on $[-1, 1]$.


13. Show directly from the definition that the Cantor-Lebesgue function is not 
absolutely continuous. 


14. The following measurability issues arose in the discussion of differentiability
of functions.

(a) Suppose $F$ is continuous on $[a, b]$. Show that

\[ D^+(F)(x) = \limsup_{h \to 0,\ h > 0} \frac{F(x + h) - F(x)}{h} \]

is measurable.

(b) Suppose $J(x) = \sum_{n=1}^{\infty} \alpha_n j_n(x)$ is a jump function as in Section 3.3. Show
that

\[ \limsup_{h \to 0} \frac{J(x + h) - J(x)}{h} \]

is measurable.

[Hint: For (a), the continuity of $F$ allows one to restrict to countably many $h$ in
taking the limsup. For (b), given $k > m$, let

\[ F^N_{k,m}(x) = \sup_{1/k \le |h| \le 1/m} \left| \frac{J_N(x + h) - J_N(x)}{h} \right|, \]

where $J_N(x) = \sum_{n=1}^{N} \alpha_n j_n(x)$. Note that each $F^N_{k,m}$ is measurable. Then,
successively, let $N \to \infty$, $k \to \infty$, and finally $m \to \infty$.]


15. Suppose $F$ is of bounded variation and continuous. Prove that $F = F_1 - F_2$,
where both $F_1$ and $F_2$ are monotonic and continuous.


16. Show that if $F$ is of bounded variation in $[a, b]$, then:

(a) $\int_a^b |F'(x)|\,dx \le T_F(a, b)$.

(b) $\int_a^b |F'(x)|\,dx = T_F(a, b)$ if and only if $F$ is absolutely continuous.

As a result of (b), the formula $L = \int_a^b |z'(t)|\,dt$ for the length of a rectifiable curve
parametrized by $z$ holds if and only if $z$ is absolutely continuous.


17. Prove that if $\{K_\epsilon\}_{\epsilon > 0}$ is a family of approximations to the identity, then

\[ \sup_{\epsilon > 0} |(f * K_\epsilon)(x)| \le c f^*(x) \]

for some constant $c > 0$ and all integrable $f$.


18. Verify the agreement between the two definitions given for the Cantor-Lebesgue 
function in Exercise 2, Chapter 1 and in Section 3.1 of this chapter. 


19. Show that if $f : \mathbb{R} \to \mathbb{R}$ is absolutely continuous, then

(a) $f$ maps sets of measure zero to sets of measure zero.

(b) $f$ maps measurable sets to measurable sets.

20. This exercise deals with functions $F$ that are absolutely continuous on $[a, b]$
and are increasing. Let $A = F(a)$ and $B = F(b)$.

(a) There exists such an $F$ that is in addition strictly increasing, but such that
$F'(x) = 0$ on a set of positive measure.

(b) The $F$ in (a) can be chosen so that there is a measurable subset $E \subset [A, B]$,
$m(E) = 0$, so that $F^{-1}(E)$ is not measurable.

(c) Prove, however, that for any increasing absolutely continuous $F$, and $E$ a
measurable subset of $[A, B]$, the set $F^{-1}(E) \cap \{F'(x) > 0\}$ is measurable.

[Hint: (a) Let $F(x) = \int_a^x \chi_K(x)\,dx$, where $K$ is the complement of a Cantor-like
set $C$ of positive measure. For (b), note that $F(C)$ is a set of measure zero. Finally,
for (c) prove first that $m(\mathcal{O}) = \int_{F^{-1}(\mathcal{O})} F'(x)\,dx$ for any open set $\mathcal{O}$.]

21. Let $F$ be absolutely continuous and increasing on $[a, b]$ with $F(a) = A$ and
$F(b) = B$. Suppose $f$ is any measurable function on $[A, B]$.

(a) Show that $f(F(x))F'(x)$ is measurable on $[a, b]$. Note: $f(F(x))$ need not be
measurable by Exercise 20 (b).

(b) Prove the change of variable formula: If $f$ is integrable on $[A, B]$, then so is
$f(F(x))F'(x)$, and

\[ \int_A^B f(y)\,dy = \int_a^b f(F(x))F'(x)\,dx. \]

[Hint: Start with the identity $m(\mathcal{O}) = \int_{F^{-1}(\mathcal{O})} F'(x)\,dx$ used in (c) of Exercise 20
above.]


22. Suppose that $F$ and $G$ are absolutely continuous on $[a, b]$. Show that their
product $FG$ is also absolutely continuous. This has the following consequences.

(a) Whenever $F$ and $G$ are absolutely continuous in $[a, b]$,

\[ \int_a^b F'(x)G(x)\,dx = -\int_a^b F(x)G'(x)\,dx + [F(x)G(x)]_a^b. \]

(b) Let $F$ be absolutely continuous in $[-\pi, \pi]$ with $F(\pi) = F(-\pi)$. Show that
if

\[ a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(x)e^{-inx}\,dx, \]

such that $F(x) \sim \sum a_n e^{inx}$, then

\[ F'(x) \sim \sum i n a_n e^{inx}. \]

(c) What happens if $F(-\pi) \ne F(\pi)$? [Hint: Consider $F(x) = x$.]


23. Let $F$ be continuous on $[a, b]$. Show the following.

(a) Suppose $(D^+F)(x) \ge 0$ for every $x \in [a, b]$. Then $F$ is increasing on $[a, b]$.

(b) If $F'(x)$ exists for every $x \in (a, b)$ and $|F'(x)| \le M$, then $|F(x) - F(y)| \le
M|x - y|$ and $F$ is absolutely continuous.

[Hint: For (a) it suffices to show that $F(b) - F(a) \ge 0$. Assume otherwise. Hence
with $G_\epsilon(x) = F(x) - F(a) + \epsilon(x - a)$, for sufficiently small $\epsilon > 0$ we have $G_\epsilon(a) =
0$, but $G_\epsilon(b) < 0$. Now let $x_0 \in [a, b)$ be the greatest value of $x_0$ such that $G_\epsilon(x_0) \ge
0$. However, $(D^+G_\epsilon)(x_0) > 0$.]

24. Suppose $F$ is an increasing function on $[a, b]$.

(a) Prove that we can write

\[ F = F_A + F_C + F_J \]

where each of the functions $F_A$, $F_C$, and $F_J$ is increasing and:

(i) $F_A$ is absolutely continuous.

(ii) $F_C$ is continuous, but $F_C'(x) = 0$ for a.e. $x$.

(iii) $F_J$ is a jump function.

(b) Moreover, each component $F_A$, $F_C$, $F_J$ is uniquely determined up to an
additive constant.

The above is the Lebesgue decomposition of $F$. There is a corresponding
decomposition for any $F$ of bounded variation.

25. The following shows the necessity of allowing for general exceptional sets of
measure zero in the differentiation Theorems 1.4, 3.4, and 3.11. Let $E$ be any set
of measure zero in $\mathbb{R}^d$. Show that:

(a) There exists a non-negative integrable $f$ in $\mathbb{R}^d$, so that

\[ \liminf_{m(B) \to 0,\ x \in B} \frac{1}{m(B)} \int_B f(y)\,dy = \infty \quad \text{for each } x \in E. \]

(b) When $d = 1$ this may be restated as follows. There is an increasing
absolutely continuous function $F$ so that

\[ D_+(F)(x) = D_-(F)(x) = \infty, \]

for each $x \in E$.

[Hint: Find open sets $\mathcal{O}_n \supset E$, with $m(\mathcal{O}_n) < 2^{-n}$, and let $f(x) = \sum_{n=1}^{\infty} \chi_{\mathcal{O}_n}(x)$.]

26. An alternative way of defining the exterior measure $m_*(E)$ of an arbitrary set
$E$, as given in Section 2 of Chapter 1, is to replace the coverings of $E$ by cubes
with coverings by balls. That is, suppose we define $m_*^{\mathcal{B}}(E)$ as $\inf \sum_{j=1}^{\infty} m(B_j)$,
where the infimum is taken over all coverings $E \subset \bigcup_{j=1}^{\infty} B_j$ by open balls. Then
$m_*(E) = m_*^{\mathcal{B}}(E)$. (Observe that this result leads to an alternate proof that the
Lebesgue measure is invariant under rotations.)

Clearly $m_*(E) \le m_*^{\mathcal{B}}(E)$. Prove the reverse inequality by showing the
following. For any $\epsilon > 0$, there is a collection of balls $\{B_j\}$ such that $E \subset \bigcup_j B_j$ while
$\sum_j m(B_j) \le m_*(E) + \epsilon$. Note also that for any preassigned $\delta$, we can choose the
balls to have diameter $< \delta$.

[Hint: Assume first that $E$ is measurable, and pick $\mathcal{O}$ open so that $\mathcal{O} \supset E$ and
$m(\mathcal{O} - E) < \epsilon'$. Next, using Corollary 3.10, find balls $B_1, \ldots, B_N$ such that
$\sum_{j=1}^{N} m(B_j) \le m(E) + 2\epsilon'$ and $m(E - \bigcup_{j=1}^{N} B_j) \le 3\epsilon'$. Finally, cover $E - \bigcup_{j=1}^{N} B_j$
by a union of cubes, the sum of whose measures is $\le 4\epsilon'$, and replace these cubes
by balls that contain them. For the general $E$, begin by applying the above when
$E$ is a cube.]


27. A rectifiable curve has a tangent line at almost all points of the curve. Make 
this statement precise. 

28. A curve in $\mathbb{R}^d$ is a continuous map $t \mapsto z(t)$ of an interval $[a, b]$ into $\mathbb{R}^d$.

(a) State and prove the analogues of the conditions dealing with the rectifiability
of curves and their length that are given in Theorems 3.1, 4.1, and 4.3.

(b) Define the (one-dimensional) Minkowski content $\mathcal{M}(K)$ of a compact set in
$\mathbb{R}^d$ as the limit (if it exists) of

\[ \frac{m(K^\delta)}{m_{d-1}(B(\delta))} \quad \text{as } \delta \to 0, \]

where $m_{d-1}(B(\delta))$ is the measure (in $\mathbb{R}^{d-1}$) of the ball defined by $B(\delta) =
\{x \in \mathbb{R}^{d-1},\ |x| < \delta\}$. State and prove analogues of Propositions 4.5 and 4.7
for curves in $\mathbb{R}^d$.


29. Let $\Gamma = \{z(t),\ a \le t \le b\}$ be a curve, and suppose it satisfies a Lipschitz
condition with exponent $\alpha$, $1/2 \le \alpha \le 1$, that is,

\[ |z(t) - z(t')| \le A|t - t'|^\alpha \quad \text{for all } t, t' \in [a, b]. \]

Show that $m(\Gamma^\delta) = O(\delta^{2 - 1/\alpha})$ for $0 < \delta \le 1$.

30. A bounded function $F$ is said to be of bounded variation on $\mathbb{R}$ if $F$ is of
bounded variation on any finite sub-interval $[a, b]$, and $\sup_{a,b} T_F(a, b) < \infty$.

Prove that such an $F$ enjoys the following two properties:

(a) $\int_{\mathbb{R}} |F(x + h) - F(x)|\,dx \le A|h|$, for some constant $A$ and all $h \in \mathbb{R}$.

(b) $\left|\int_{\mathbb{R}} F(x)\varphi'(x)\,dx\right| \le A$, where $\varphi$ ranges over all $C^1$ functions of bounded
support with $\sup_{x \in \mathbb{R}} |\varphi(x)| \le 1$.

For the converse, and analogues in $\mathbb{R}^d$, see Problem 6* below.

[Hint: For (a), write $F = F_1 - F_2$, where $F_j$ are monotonic and bounded. For (b),
deduce this from (a).]

31. Let $F$ be the Cantor-Lebesgue function described in Section 3.1. Consider the
curve that is the graph of $F$, that is, the curve given by $x(t) = t$ and $y(t) = F(t)$
with $0 \le t \le 1$. Prove that the length $L(\bar{x})$ of the segment $0 \le t \le \bar{x}$ of the curve
is given by $L(\bar{x}) = \bar{x} + F(\bar{x})$. Hence the total length of the curve is 2.

32. Let $f : \mathbb{R} \to \mathbb{R}$. Prove that $f$ satisfies the Lipschitz condition

\[ |f(x) - f(y)| \le M|x - y| \]

for some $M$ and all $x, y \in \mathbb{R}$, if and only if $f$ satisfies the following two properties:

(i) $f$ is absolutely continuous.

(ii) $|f'(x)| \le M$ for a.e. $x$.


6 Problems 

1. Prove the following variant of the Vitali covering lemma: If $E$ is covered in
the Vitali sense by a family $\mathcal{B}$ of balls, and $0 < m_*(E) < \infty$, then for every $\eta > 0$
there exists a disjoint collection of balls $\{B_j\}_{j=1}^{\infty}$ in $\mathcal{B}$ such that

\[ m_*\!\left(E \setminus \bigcup_{j=1}^{\infty} B_j\right) = 0 \quad\text{and}\quad \sum_{j=1}^{\infty} |B_j| \le (1 + \eta)\, m_*(E). \]


2. The following simple one-dimensional covering lemma can be used in a number
of different situations.

Suppose $I_1, I_2, \ldots, I_N$ is a given finite collection of open intervals in $\mathbb{R}$. Then
there are two finite sub-collections $I'_1, I'_2, \ldots, I'_K$ and $I''_1, I''_2, \ldots, I''_L$, so that each
sub-collection consists of mutually disjoint intervals and

\[ \bigcup_{j=1}^{N} I_j = \bigcup_{k=1}^{K} I'_k \ \cup\ \bigcup_{\ell=1}^{L} I''_\ell. \]

Note that, in contrast with Lemma 1.2, the full union is covered and not merely a
part.

[Hint: Choose $I'_1$ to be an interval whose left end-point is as far left as possible.
Discard all intervals contained in $I'_1$. If the remaining intervals are disjoint from
$I'_1$, select again an interval as far to the left as possible, and call it $I'_2$. Otherwise
choose an interval that intersects $I'_1$, but reaches out to the right as far as possible,
and call this interval $I''_1$. Repeat this procedure.]


3.* There is no direct analogue of Problem 2 in higher dimensions. However, a full
covering is afforded by the Besicovitch covering lemma. A version of this lemma
states that there is an integer $N$ (dependent only on the dimension $d$) with the
following property. Suppose $E$ is any bounded set in $\mathbb{R}^d$ that is covered by a
collection $\mathcal{B}$ of balls in the (strong) sense that for each $x \in E$, there is a $B \in \mathcal{B}$
whose center is $x$. Then, there are $N$ sub-collections $\mathcal{B}_1, \mathcal{B}_2, \ldots, \mathcal{B}_N$ of the original
collection $\mathcal{B}$, such that each $\mathcal{B}_j$ is a collection of disjoint balls, and moreover,

\[ E \subset \bigcup_{B \in \mathcal{B}'} B, \quad \text{where } \mathcal{B}' = \mathcal{B}_1 \cup \mathcal{B}_2 \cup \cdots \cup \mathcal{B}_N. \]


4. A real-valued function $\varphi$ defined on an interval $(a, b)$ is convex if the region
lying above its graph $\{(x, y) \in \mathbb{R}^2 : y > \varphi(x),\ a < x < b\}$ is a convex set, as defined
in Section 5*, Chapter 1. Equivalently, $\varphi$ is convex if

\[ \varphi(\theta x_1 + (1 - \theta)x_2) \le \theta\varphi(x_1) + (1 - \theta)\varphi(x_2) \]

for every $x_1, x_2 \in (a, b)$ and $0 \le \theta \le 1$. One can also observe as a consequence that
we have the following inequality of the slopes:

\[ \frac{\varphi(x + h) - \varphi(x)}{h} \le \frac{\varphi(y) - \varphi(x)}{y - x} \le \frac{\varphi(y) - \varphi(y - h)}{h} \]

whenever $x < y$, $h > 0$, and $x + h < y$.
The following can then be proved.

(a) $\varphi$ is continuous on $(a, b)$.

(b) $\varphi$ satisfies a Lipschitz condition of order 1 in any proper closed sub-interval
$[a', b']$ of $(a, b)$. Hence $\varphi$ is absolutely continuous in each sub-interval.

(c) $\varphi'$ exists at all but an at most denumerable number of points, and $\varphi' = D^+\varphi$
is an increasing function with

\[ \varphi(y) - \varphi(x) = \int_x^y \varphi'(t)\,dt. \]

(d) Conversely, if $\psi$ is any increasing function on $(a, b)$, then $\varphi(x) = \int_c^x \psi(t)\,dt$
is a convex function in $(a, b)$ (for $c \in (a, b)$).

Figure 13. A convex function


5. Suppose that $F$ is continuous on $[a, b]$, $F'(x)$ exists for every $x \in (a, b)$, and
$F'(x)$ is integrable. Then $F$ is absolutely continuous and

\[ F(b) - F(a) = \int_a^b F'(x)\,dx. \]

[Hint: Assume $F'(x) \ge 0$ for a.e. $x$. We want to conclude that $F(b) \ge F(a)$. Let
$E$ be the set of measure 0 of those $x$ such that $F'(x) < 0$. Then according to
Exercise 25, there is a function $\Phi$ which is increasing, absolutely continuous, and for
which $D^+\Phi(x) = \infty$, $x \in E$. Consider $F + \delta\Phi$, for each $\delta$ and apply the result (a)
in Exercise 23.]

6.* The following converse to Exercise 30 characterizes functions of bounded
variation.

Suppose $F$ is a bounded measurable function on $\mathbb{R}$. If $F$ satisfies either of
conditions (a) or (b) in that exercise, then $F$ can be modified on a set of measure
zero so as to become a function of bounded variation on $\mathbb{R}$.

Moreover, on $\mathbb{R}^d$ we have the following assertion. Suppose $F$ is a bounded
measurable function on $\mathbb{R}^d$. Then the following two conditions on $F$ are equivalent:

(a') $\int_{\mathbb{R}^d} |F(x + h) - F(x)|\,dx \le A|h|$, for all $h \in \mathbb{R}^d$.

(b') $\left|\int_{\mathbb{R}^d} F(x) \frac{\partial\varphi}{\partial x_j}\,dx\right| \le A$, for all $j = 1, \ldots, d$,

for all $\varphi \in C^1$ that have bounded support, and for which $\sup_{x \in \mathbb{R}^d} |\varphi(x)| \le 1$.

The class of functions that satisfy either (a') or (b') is the extension to $\mathbb{R}^d$ of
the class of functions of bounded variation.

7. Consider the function

\[ f_1(x) = \sum_{n=0}^{\infty} 2^{-n} e^{2\pi i 2^n x}. \]

(a) Prove that $f_1$ satisfies $|f_1(x) - f_1(y)| \le A_\alpha |x - y|^\alpha$ for each $0 < \alpha < 1$.

(b)* However, $f_1$ is nowhere differentiable, hence not of bounded variation.
8.* Let $\mathcal{R}$ denote the set of all rectangles in $\mathbb{R}^2$ that contain the origin, and with
sides parallel to the coordinate axes. Consider the maximal operator associated to
this family, namely

\[ f^*_{\mathcal{R}}(x) = \sup_{R \in \mathcal{R}} \frac{1}{m(R)} \int_R |f(x - y)|\,dy. \]

(a) Then, $f \mapsto f^*_{\mathcal{R}}$ does not satisfy the weak type inequality

\[ m(\{x : f^*_{\mathcal{R}}(x) > \alpha\}) \le \frac{A}{\alpha} \|f\|_{L^1} \]

for all $\alpha > 0$, all integrable $f$, and some $A > 0$.

(b) Using this, one can show that there exists $f \in L^1(\mathbb{R}^2)$ so that for $R \in \mathcal{R}$

\[ \limsup_{\operatorname{diam}(R) \to 0} \frac{1}{m(R)} \int_R f(x - y)\,dy = \infty \quad \text{for almost every } x. \]

Here $\operatorname{diam}(R) = \sup_{x, y \in R} |x - y|$ equals the diameter of the rectangle.

[Hint: For part (a), let $B$ be the unit ball, and consider the function $\varphi(x) =
\chi_B(x)/m(B)$. For $\delta > 0$, let $\varphi_\delta(x) = \delta^{-2}\varphi(x/\delta)$. Then

\[ (\varphi_\delta)^*_{\mathcal{R}}(x) \to \frac{1}{|x_1|\,|x_2|} \quad \text{as } \delta \to 0, \]

for every $(x_1, x_2)$, with $x_1 x_2 \ne 0$. If the weak type inequality held, then we would
have

\[ m(\{|x| \le 1 : |x_1 x_2|^{-1} > \alpha\}) \le \frac{A}{\alpha}. \]

This is a contradiction since the left-hand side is of the order of $(\log \alpha)/\alpha$ as $\alpha$
tends to infinity.]





4 Hilbert Spaces: An Introduction


Born barely 10 years ago, the theory of integral equations
has attracted wide attention as much as for its
inherent interest as for the importance of its applications.
Several of its results are already classic, and no
one doubts that in a few years every course in analysis
will devote a chapter to it.

M. Plancherel, 1912


There are two reasons that account for the importance of Hilbert
spaces. First, they arise as the natural infinite-dimensional generalizations
of Euclidean spaces, and as such, they enjoy the familiar properties
of orthogonality, complemented by the important feature of completeness.
Second, the theory of Hilbert spaces serves both as a conceptual
framework and as a language that formulates some basic arguments in
analysis in a more abstract setting.

For us the immediate link with integration theory occurs because of
the example of the Lebesgue space $L^2(\mathbb{R}^d)$. The related example of
$L^2([-\pi, \pi])$ is what connects Hilbert spaces with Fourier series. The
latter Hilbert space can also be used in an elegant way to analyze the
boundary behavior of bounded holomorphic functions in the unit disc.

A basic aspect of the theory of Hilbert spaces, as in the familiar
finite-dimensional case, is the study of their linear transformations. Given the
introductory nature of this chapter, we limit ourselves to rather brief
discussions of several classes of such operators: unitary mappings,
projections, linear functionals, and compact operators.


1 The Hilbert space $L^2(\mathbb{R}^d)$

A prime example of a Hilbert space is the collection of square integrable
functions on $\mathbb{R}^d$, which is denoted by $L^2(\mathbb{R}^d)$, and consists of
all complex-valued measurable functions $f$ that satisfy

\[ \int_{\mathbb{R}^d} |f(x)|^2\,dx < \infty. \]

The resulting $L^2(\mathbb{R}^d)$-norm of $f$ is defined by

\[ \|f\|_{L^2(\mathbb{R}^d)} = \left(\int_{\mathbb{R}^d} |f(x)|^2\,dx\right)^{1/2}. \]

The reader should compare these definitions with those for the space
$L^1(\mathbb{R}^d)$ of integrable functions and its norm that were described in
Section 2, Chapter 2. A crucial difference is that $L^2$ has an inner product,
which $L^1$ does not. Some relative inclusion relations between those spaces
are taken up in Exercise 5.

The space $L^2(\mathbb{R}^d)$ is naturally equipped with the following inner product:

\[ (f, g) = \int_{\mathbb{R}^d} f(x)\overline{g(x)}\,dx, \quad \text{whenever } f, g \in L^2(\mathbb{R}^d), \]

which is intimately related to the $L^2$-norm since

\[ (f, f)^{1/2} = \|f\|_{L^2(\mathbb{R}^d)}. \]

As in the case of integrable functions, the condition $\|f\|_{L^2(\mathbb{R}^d)} = 0$ only
implies $f(x) = 0$ almost everywhere. Therefore, we in fact identify functions
that are equal almost everywhere, and define $L^2(\mathbb{R}^d)$ as the space
of equivalence classes under this identification. However, in practice it is
often convenient to think of elements in $L^2(\mathbb{R}^d)$ as functions, and not as
equivalence classes of functions.

For the definition of the inner product $(f, g)$ to be meaningful we need
to know that $f\overline{g}$ is integrable on $\mathbb{R}^d$ whenever $f$ and $g$ belong to $L^2(\mathbb{R}^d)$.
This and other basic properties of the space of square integrable functions
are gathered in the next proposition.

In the rest of this chapter we shall denote the $L^2$-norm by $\|\cdot\|$ (dropping
the subscript $L^2(\mathbb{R}^d)$) unless stated otherwise.

Proposition 1.1 The space $L^2(\mathbb{R}^d)$ has the following properties:

(i) $L^2(\mathbb{R}^d)$ is a vector space.

(ii) $f(x)\overline{g(x)}$ is integrable whenever $f, g \in L^2(\mathbb{R}^d)$, and the
Cauchy-Schwarz inequality holds: $|(f, g)| \le \|f\|\,\|g\|$.

(iii) If $g \in L^2(\mathbb{R}^d)$ is fixed, the map $f \mapsto (f, g)$ is linear in $f$, and also
$(f, g) = \overline{(g, f)}$.

(iv) The triangle inequality holds: $\|f + g\| \le \|f\| + \|g\|$.
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Proof. If f,ge L 2 (R d ), then since \f(x)-\-g(x)\ < 2max(|/(x)|, |^(x)|) 
we have 

1/ ㈤ + g(x)\ 2 < 4(|/(x)| 2 + \g(x)\ 2 ), 

therefore 

J I/ + 5I 2 < 4 J |/| 2 +4 J \g\ 2 < 00 , 

hence / + ^ G L 2 (R d ). Also, if A G C we clearly have A/ G L 2 (M d ), and 
part (i) is proved. 

To see why fg is integrable whenever / and g are in L 2 (R d ), it suffices 
to recall that for all A, B > 0, one has 2AB < A 2 B 2 , so that 

⑴ / + 

To prove the Cauchy-Schwarz inequality, we first observe that if either
$\|f\| = 0$ or $\|g\| = 0$, then $f\overline{g}$ is zero almost everywhere,
hence $(f, g) = 0$ and the inequality is obvious. Next, if we assume that
$\|f\| = \|g\| = 1$, then we get the desired inequality $|(f, g)| \le 1$.
This follows from the fact that $|(f, g)| \le \int |f\overline{g}|$, and
inequality (1). Finally, in the case when both $\|f\|$ and $\|g\|$ are
non-zero, we normalize $f$ and $g$ by setting

$$\tilde{f} = f/\|f\| \quad\text{and}\quad \tilde{g} = g/\|g\|,$$

so that $\|\tilde{f}\| = \|\tilde{g}\| = 1$. By our previous observation we
then find

$$|(\tilde{f}, \tilde{g})| \le 1.$$

Multiplying both sides of the above by $\|f\|\,\|g\|$ yields the
Cauchy-Schwarz inequality.

Part (iii) follows from the linearity of the integral.

Finally, to prove the triangle inequality, we use the Cauchy-Schwarz
inequality as follows:

$$\begin{aligned}
\|f + g\|^2 &= (f + g, f + g) \\
&= \|f\|^2 + (f, g) + (g, f) + \|g\|^2 \\
&\le \|f\|^2 + 2|(f, g)| + \|g\|^2 \\
&\le \|f\|^2 + 2\|f\|\,\|g\| + \|g\|^2 \\
&= (\|f\| + \|g\|)^2,
\end{aligned}$$

and taking square roots completes the argument.

We turn our attention to the notion of a limit in the space
$L^2(\mathbb{R}^d)$. The norm on $L^2$ induces a metric $d$ as follows: if
$f, g \in L^2(\mathbb{R}^d)$, then

$$d(f, g) = \|f - g\|_{L^2(\mathbb{R}^d)}.$$

A sequence $\{f_n\} \subset L^2(\mathbb{R}^d)$ is said to be Cauchy if
$d(f_n, f_m) \to 0$ as $n, m \to \infty$. Moreover, this sequence converges
to $f \in L^2(\mathbb{R}^d)$ if $d(f_n, f) \to 0$ as $n \to \infty$.
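The Cauchy-Schwarz and triangle inequalities of Proposition 1.1, together with the metric $d$ just defined, can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it replaces the Lebesgue integral by a midpoint Riemann sum on $[-5, 5]$, and the two sample functions and the helper names `inner`, `norm`, and `dist` are hypothetical choices, not part of the text.

```python
import cmath
import math

def inner(f, g, dx):
    """Riemann-sum stand-in for (f, g) = integral of f(x) * conj(g(x)) dx."""
    return sum(a * b.conjugate() for a, b in zip(f, g)) * dx

def norm(f, dx):
    """Discrete analogue of the L^2-norm, ||f|| = (f, f)^(1/2)."""
    return math.sqrt(inner(f, f, dx).real)

def dist(f, g, dx):
    """Metric induced by the norm: d(f, g) = ||f - g||."""
    return norm([a - b for a, b in zip(f, g)], dx)

# Hypothetical test functions sampled at midpoints of a grid on [-5, 5].
n = 2000
dx = 10.0 / n
xs = [-5.0 + (j + 0.5) * dx for j in range(n)]
f = [cmath.exp(complex(-x * x, x)) for x in xs]            # e^{-x^2} e^{ix}
g = [complex(1.0 / (1.0 + x * x), x * math.exp(-abs(x))) for x in xs]

cs_left = abs(inner(f, g, dx))                        # |(f, g)|
cs_right = norm(f, dx) * norm(g, dx)                  # ||f|| ||g||
tri_left = norm([a + b for a, b in zip(f, g)], dx)    # ||f + g||
tri_right = norm(f, dx) + norm(g, dx)                 # ||f|| + ||g||
```

Because the discrete sums themselves form an inner product on a finite-dimensional space, both inequalities hold for the sampled data exactly (up to rounding), independently of how well the sums approximate the integrals.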

Theorem 1.2 The space $L^2(\mathbb{R}^d)$ is complete in its metric.

In other words, every Cauchy sequence in $L^2(\mathbb{R}^d)$ converges to a
function in $L^2(\mathbb{R}^d)$. This theorem, which is in sharp contrast
with the situation for Riemann integrable functions, is a graphic
illustration of the usefulness of Lebesgue's theory of integration. We
elaborate on this point and its relation to Fourier series in Section 3
below.

Proof. The argument given here follows closely the proof in Chapter 2
that $L^1$ is complete. Let $\{f_n\}_{n=1}^\infty$ be a Cauchy sequence in
$L^2$, and consider a subsequence $\{f_{n_k}\}_{k=1}^\infty$ of $\{f_n\}$
with the following property:

$$\|f_{n_{k+1}} - f_{n_k}\| \le 2^{-k}, \quad\text{for all } k \ge 1.$$

If we now consider the series whose convergence will be seen below,

$$f(x) = f_{n_1}(x) + \sum_{k=1}^{\infty} \left(f_{n_{k+1}}(x) - f_{n_k}(x)\right)$$

and

$$g(x) = |f_{n_1}(x)| + \sum_{k=1}^{\infty} \left|f_{n_{k+1}}(x) - f_{n_k}(x)\right|,$$

together with the partial sums

$$S_K(f)(x) = f_{n_1}(x) + \sum_{k=1}^{K} \left(f_{n_{k+1}}(x) - f_{n_k}(x)\right)$$

and

$$S_K(g)(x) = |f_{n_1}(x)| + \sum_{k=1}^{K} \left|f_{n_{k+1}}(x) - f_{n_k}(x)\right|,$$

then the triangle inequality implies

$$\begin{aligned}
\|S_K(g)\| &\le \|f_{n_1}\| + \sum_{k=1}^{K} \|f_{n_{k+1}} - f_{n_k}\| \\
&\le \|f_{n_1}\| + \sum_{k=1}^{K} 2^{-k}.
\end{aligned}$$


Letting $K$ tend to infinity, and applying the monotone convergence
theorem proves that $\int |g|^2 < \infty$, and since $|f| \le g$, we must
have $f \in L^2(\mathbb{R}^d)$.

In particular, the series defining $f$ converges almost everywhere, and
since (by construction of the telescopic series) the $(K-1)$th partial sum
of this series is precisely $f_{n_K}$, we find that

$$f_{n_k}(x) \to f(x) \quad\text{a.e. } x.$$

To prove that $f_{n_k} \to f$ in $L^2(\mathbb{R}^d)$ as well, we simply
observe that $|f - S_K(f)|^2 \le (2g)^2$ for all $K$, and apply the
dominated convergence theorem to get $\|f_{n_k} - f\| \to 0$ as $k$ tends
to infinity.

Finally, the last step of the proof consists of recalling that $\{f_n\}$
is Cauchy. Given $\epsilon$, there exists $N$ such that for all $n, m > N$
we have $\|f_n - f_m\| < \epsilon/2$. If $n_k$ is chosen so that $n_k > N$,
and $\|f_{n_k} - f\| < \epsilon/2$, then the triangle inequality implies

$$\|f_n - f\| \le \|f_n - f_{n_k}\| + \|f_{n_k} - f\| < \epsilon$$

whenever $n > N$. This concludes the proof of the theorem.

An additional useful property of $L^2(\mathbb{R}^d)$ is contained in the
following theorem.

Theorem 1.3 The space $L^2(\mathbb{R}^d)$ is separable, in the sense that
there exists a countable collection $\{f_k\}$ of elements in
$L^2(\mathbb{R}^d)$ such that their linear combinations are dense in
$L^2(\mathbb{R}^d)$.

Proof. Consider the family of functions of the form $r\chi_R(x)$, where $r$
is a complex number with rational real and imaginary parts, and $R$ is a
rectangle in $\mathbb{R}^d$ with rational coordinates. We claim that finite
linear combinations of these types of functions are dense in
$L^2(\mathbb{R}^d)$.

Suppose $f \in L^2(\mathbb{R}^d)$ and let $\epsilon > 0$. Consider for each
$n \ge 1$ the function $g_n$ defined by

$$g_n(x) = \begin{cases} f(x) & \text{if } |x| \le n \text{ and } |f(x)| \le n, \\ 0 & \text{otherwise.} \end{cases}$$

Then $|f - g_n|^2 \le 4|f|^2$ and $g_n(x) \to f(x)$ almost everywhere.$^1$
The dominated convergence theorem implies that
$\|f - g_n\|^2_{L^2(\mathbb{R}^d)} \to 0$ as $n$ tends to infinity;
therefore we have

$$\|f - g_N\|_{L^2(\mathbb{R}^d)} < \epsilon/2 \quad\text{for some } N.$$

Let $g = g_N$, and note that $g$ is a bounded function supported on a
bounded set; thus $g \in L^1(\mathbb{R}^d)$. We may now find a step function
$\varphi$ so that $|\varphi| \le N$ and
$\int |g - \varphi| < \epsilon^2/16N$ (Theorem 2.4, Chapter 2). By
replacing the coefficients and rectangles that appear in the canonical form
of $\varphi$ by complex numbers with rational real and imaginary parts, and
rectangles with rational coordinates, we may find a $\psi$ with
$|\psi| \le N$ and $\int |g - \psi| < \epsilon^2/8N$. Finally, we note that

$$\int |g - \psi|^2 \le 2N \int |g - \psi| < \epsilon^2/4.$$

Consequently $\|g - \psi\| < \epsilon/2$, therefore
$\|f - \psi\| < \epsilon$, and the proof is complete.

The example $L^2(\mathbb{R}^d)$ possesses all the characteristic properties
of a Hilbert space, and motivates the definition of the abstract version of
this concept.

2 Hilbert spaces 

A set $\mathcal{H}$ is a Hilbert space if it satisfies the following:

(i) $\mathcal{H}$ is a vector space over $\mathbb{C}$ (or $\mathbb{R}$).$^2$

(ii) $\mathcal{H}$ is equipped with an inner product $(\cdot, \cdot)$, so that

• $f \mapsto (f, g)$ is linear on $\mathcal{H}$ for every fixed $g \in \mathcal{H}$,

• $(f, g) = \overline{(g, f)}$,

• $(f, f) \ge 0$ for all $f \in \mathcal{H}$.

We let $\|f\| = (f, f)^{1/2}$.

(iii) $\|f\| = 0$ if and only if $f = 0$.

(iv) The Cauchy-Schwarz and triangle inequalities hold

$$|(f, g)| \le \|f\|\,\|g\| \quad\text{and}\quad \|f + g\| \le \|f\| + \|g\|$$

for all $f, g \in \mathcal{H}$.

(v) $\mathcal{H}$ is complete in the metric $d(f, g) = \|f - g\|$.

(vi) $\mathcal{H}$ is separable.

$^1$ By definition $f \in L^2(\mathbb{R}^d)$ implies that $|f|^2$ is
integrable, hence $f(x)$ is finite for a.e. $x$.

$^2$ At this stage we consider both cases, where the scalar field can be
either $\mathbb{C}$ or $\mathbb{R}$. However, in many applications, such as
in the context of Fourier analysis, one deals primarily with Hilbert spaces
over $\mathbb{C}$.

We make two comments about the definition of a Hilbert space. First,
the Cauchy-Schwarz and triangle inequalities in (iv) are in fact easy
consequences of assumptions (i) and (ii). (See Exercise 1.) Second, we
make the requirement that $\mathcal{H}$ be separable because that is the
case in most applications encountered. That is not to say that there are no
interesting non-separable examples; one such example is described in
Problem 2.

Also, we remark that in the context of a Hilbert space we shall often
write $\lim_{n\to\infty} f_n = f$ or $f_n \to f$ to mean that
$\lim_{n\to\infty} \|f_n - f\| = 0$, which is the same as $d(f_n, f) \to 0$.

We give some examples of Hilbert spaces. 

Example 1. If $E$ is a measurable subset of $\mathbb{R}^d$ with $m(E) > 0$,
we let $L^2(E)$ denote the space of square integrable functions that are
supported on $E$,

$$L^2(E) = \left\{ f \text{ supported on } E, \text{ so that } \int_E |f(x)|^2\,dx < \infty \right\}.$$

The inner product and norm on $L^2(E)$ are then

$$(f, g) = \int_E f(x)\overline{g(x)}\,dx \quad\text{and}\quad \|f\| = \left( \int_E |f(x)|^2\,dx \right)^{1/2}.$$

Once again, we consider two elements of $L^2(E)$ to be equivalent if they
differ only on a set of measure zero; this guarantees that $\|f\| = 0$
implies $f = 0$. The properties (i) through (vi) follow from those of
$L^2(\mathbb{R}^d)$ proved above.


Example 2. A simple example is the finite-dimensional complex Euclidean
space. Indeed,

$$\mathbb{C}^N = \{(a_1, \ldots, a_N) : a_k \in \mathbb{C}\}$$

becomes a Hilbert space when equipped with the inner product

$$(a, b) = \sum_{k=1}^{N} a_k \overline{b_k},$$

where $a = (a_1, \ldots, a_N)$ and $b = (b_1, \ldots, b_N)$ are in
$\mathbb{C}^N$. The norm is then

$$\|a\| = \left( \sum_{k=1}^{N} |a_k|^2 \right)^{1/2}.$$

One can formulate in the same way the real Hilbert space $\mathbb{R}^N$.

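Example 3. An infinite-dimensional analogue of the above example is
the space $\ell^2(\mathbb{Z})$. By definition

$$\ell^2(\mathbb{Z}) = \left\{ (\ldots, a_{-2}, a_{-1}, a_0, a_1, \ldots) : a_i \in \mathbb{C},\ \sum_{n=-\infty}^{\infty} |a_n|^2 < \infty \right\}.$$

If we denote infinite sequences by $a$ and $b$, the inner product and norm
on $\ell^2(\mathbb{Z})$ are

$$(a, b) = \sum_{k=-\infty}^{\infty} a_k \overline{b_k} \quad\text{and}\quad \|a\| = \left( \sum_{k=-\infty}^{\infty} |a_k|^2 \right)^{1/2}.$$

We leave the proof that $\ell^2(\mathbb{Z})$ is a Hilbert space as Exercise 4.

While this example is very simple, it will turn out that all
infinite-dimensional (separable) Hilbert spaces are $\ell^2(\mathbb{Z})$ in
disguise.

Also, a slight variant of this space is $\ell^2(\mathbb{N})$, where we take
only one-sided sequences, that is,

$$\ell^2(\mathbb{N}) = \left\{ (a_1, a_2, \ldots) : a_i \in \mathbb{C},\ \sum_{n=1}^{\infty} |a_n|^2 < \infty \right\}.$$

The inner product and norm are then defined in the same way with the
sums extending from $n = 1$ to $\infty$.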

A characteristic feature of a Hilbert space is the notion of
orthogonality. This aspect, with its rich geometric and analytic
consequences, distinguishes Hilbert spaces from other normed vector spaces.
We now describe some of these properties.

2.1 Orthogonality

Two elements $f$ and $g$ in a Hilbert space $\mathcal{H}$ with inner product
$(\cdot, \cdot)$ are orthogonal or perpendicular if

$$(f, g) = 0, \quad\text{and we then write } f \perp g.$$

The first simple observation is that the usual theorem of Pythagoras
holds in the setting of abstract Hilbert spaces:

Proposition 2.1 If $f \perp g$, then $\|f + g\|^2 = \|f\|^2 + \|g\|^2$.

Proof. It suffices to note that $(f, g) = 0$ implies $(g, f) = 0$, and
therefore

$$\begin{aligned}
\|f + g\|^2 = (f + g, f + g) &= \|f\|^2 + (f, g) + (g, f) + \|g\|^2 \\
&= \|f\|^2 + \|g\|^2.
\end{aligned}$$

A finite or countably infinite subset $\{e_1, e_2, \ldots\}$ of a Hilbert
space $\mathcal{H}$ is orthonormal if

$$(e_k, e_\ell) = \begin{cases} 1 & \text{when } k = \ell, \\ 0 & \text{when } k \ne \ell. \end{cases}$$

In other words, each $e_k$ has unit norm and is orthogonal to $e_\ell$
whenever $\ell \ne k$.


Proposition 2.2 If $\{e_k\}_{k=1}^\infty$ is orthonormal, and
$f = \sum a_k e_k \in \mathcal{H}$ where the sum is finite, then

$$\|f\|^2 = \sum |a_k|^2.$$

The proof is a simple application of the Pythagorean theorem.

Given an orthonormal subset $\{e_1, e_2, \ldots\} = \{e_k\}_{k=1}^\infty$ of
$\mathcal{H}$, a natural problem is to determine whether this subset spans
all of $\mathcal{H}$, that is, whether finite linear combinations of
elements in $\{e_1, e_2, \ldots\}$ are dense in $\mathcal{H}$. If this is
the case, we say that $\{e_k\}_{k=1}^\infty$ is an orthonormal basis for
$\mathcal{H}$. If we are in the presence of an orthonormal basis, we might
expect that any $f \in \mathcal{H}$ takes the form

$$f = \sum_{k=1}^{\infty} a_k e_k,$$

for some constants $a_k \in \mathbb{C}$. In fact, taking the inner product
of both sides with $e_j$, and recalling that $\{e_k\}$ is orthonormal yields
(formally)

$$(f, e_j) = a_j.$$

This question is motivated by Fourier series. In fact, a good insight
into the theorem below is afforded by considering the case where
$\mathcal{H}$ is $L^2([-\pi, \pi])$ with inner product
$(f, g) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\overline{g(x)}\,dx$, and the
orthonormal set $\{e_k\}_{k=1}^\infty$ is merely a relabeling of the
exponentials $\{e^{inx}\}_{n=-\infty}^{\infty}$.

Adapting the notation used in Fourier series, we write
$f \sim \sum_{k=1}^\infty a_k e_k$, where $a_j = (f, e_j)$ for all $j$.

In the next theorem, we provide four equivalent characterizations that
$\{e_k\}$ is an orthonormal basis for $\mathcal{H}$.

Theorem 2.3 The following properties of an orthonormal set
$\{e_k\}_{k=1}^\infty$ are equivalent.

(i) Finite linear combinations of elements in $\{e_k\}$ are dense in $\mathcal{H}$.

(ii) If $f \in \mathcal{H}$ and $(f, e_j) = 0$ for all $j$, then $f = 0$.

(iii) If $f \in \mathcal{H}$, and $S_N(f) = \sum_{k=1}^N a_k e_k$, where
$a_k = (f, e_k)$, then $S_N(f) \to f$ as $N \to \infty$ in the norm.

(iv) If $a_k = (f, e_k)$, then $\|f\|^2 = \sum_{k=1}^\infty |a_k|^2$.

Proof. We prove that each property implies the next, with the last
one implying the first.

We begin by assuming (i). Given $f \in \mathcal{H}$ with $(f, e_j) = 0$ for
all $j$, we wish to prove that $f = 0$. By assumption, there exists a
sequence $\{g_n\}$ of elements in $\mathcal{H}$ that are finite linear
combinations of elements in $\{e_k\}$, and such that $\|f - g_n\|$ tends to
0 as $n$ goes to infinity. Since $(f, e_j) = 0$ for all $j$, we must have
$(f, g_n) = 0$ for all $n$; therefore an application of the Cauchy-Schwarz
inequality gives

$$\|f\|^2 = (f, f) = (f, f - g_n) \le \|f\|\,\|f - g_n\| \quad\text{for all } n.$$

Letting $n \to \infty$ proves that $\|f\|^2 = 0$; hence $f = 0$, and (i)
implies (ii).
Now suppose that (ii) is verified. For $f \in \mathcal{H}$ we define

$$S_N(f) = \sum_{k=1}^{N} a_k e_k, \quad\text{where } a_k = (f, e_k),$$

and prove first that $S_N(f)$ converges to some element
$g \in \mathcal{H}$. Indeed, one notices that the definition of $a_k$
implies $(f - S_N(f)) \perp S_N(f)$, so the Pythagorean theorem and
Proposition 2.2 give

(2)   $$\|f\|^2 = \|f - S_N(f)\|^2 + \|S_N(f)\|^2 = \|f - S_N(f)\|^2 + \sum_{k=1}^{N} |a_k|^2.$$

Hence $\|f\|^2 \ge \sum_{k=1}^N |a_k|^2$, and letting $N$ tend to infinity
we obtain Bessel's inequality

$$\sum_{k=1}^{\infty} |a_k|^2 \le \|f\|^2,$$

which implies that the series $\sum_{k=1}^\infty |a_k|^2$ converges.
Therefore, $\{S_N(f)\}_{N=1}^\infty$ forms a Cauchy sequence in
$\mathcal{H}$ since

$$\|S_N(f) - S_M(f)\|^2 = \sum_{k=M+1}^{N} |a_k|^2 \quad\text{whenever } N > M.$$

Since $\mathcal{H}$ is complete, there exists $g \in \mathcal{H}$ such that
$S_N(f) \to g$ as $N$ tends to infinity.

Fix $j$, and note that for all sufficiently large $N$,
$(f - S_N(f), e_j) = a_j - a_j = 0$. Since $S_N(f)$ tends to $g$, we
conclude that

$$(f - g, e_j) = 0 \quad\text{for all } j.$$

Hence $f = g$ by assumption (ii), and we have proved that
$f = \sum_{k=1}^\infty a_k e_k$.
Now assume that (iii) holds. Observe from (2) that we immediately
get in the limit as $N$ goes to infinity

$$\|f\|^2 = \sum_{k=1}^{\infty} |a_k|^2.$$

Finally, if (iv) holds, then again from (2) we see that $\|f - S_N(f)\|$
converges to 0. Since each $S_N(f)$ is a finite linear combination of
elements in $\{e_k\}$, we have completed the circle of implications, and the
theorem is proved.

In particular, a closer look at the proof shows that Bessel's inequality
holds for any orthonormal family $\{e_k\}$. In contrast, the identity

$$\|f\|^2 = \sum_{k=1}^{\infty} |a_k|^2, \quad\text{where } a_k = (f, e_k),$$

which is called Parseval's identity, holds if and only if
$\{e_k\}_{k=1}^\infty$ is also an orthonormal basis.
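The contrast between Bessel's inequality and Parseval's identity is easy to see in a small finite-dimensional case. The following sketch works in $\mathbb{C}^3$ with the standard inner product; the vectors `e1`, `e2`, `e3`, the sample vector `f`, and the helper `coeff_energy` are illustrative choices made here, not taken from the text.

```python
import math

def inner(u, v):
    """Inner product on C^N: linear in u, conjugate-linear in v."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))

s = 1 / math.sqrt(2)
e1 = [s, s, 0]     # e1 and e2 are orthonormal but do not span C^3
e2 = [s, -s, 0]
e3 = [0, 0, 1]     # adding e3 gives an orthonormal basis of C^3
f = [1 + 2j, -1j, 3]

norm_sq = inner(f, f).real   # ||f||^2

def coeff_energy(family):
    """Sum of |a_k|^2 with a_k = (f, e_k) over the given orthonormal family."""
    return sum(abs(inner(f, e)) ** 2 for e in family)

bessel = coeff_energy([e1, e2])         # strictly smaller than ||f||^2 here
parseval = coeff_energy([e1, e2, e3])   # equals ||f||^2
```

With the incomplete family the coefficient sum falls short of $\|f\|^2$ because the component of `f` along `e3` is not accounted for; with all three vectors the two quantities agree.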


Now we turn our attention to the existence of a basis. 


Theorem 2.4 Any Hilbert space has an orthonormal basis. 

The first step in the proof of this fact is to recall that (by definition)
a Hilbert space $\mathcal{H}$ is separable. Hence, we may choose a countable
collection of elements $\mathcal{F} = \{h_k\}$ in $\mathcal{H}$ so that
finite linear combinations of elements in $\mathcal{F}$ are dense in
$\mathcal{H}$.

We start by recalling a definition already used in the case of
finite-dimensional vector spaces. Finitely many elements $g_1, \ldots, g_N$
are said to be linearly independent if whenever

$$a_1 g_1 + \cdots + a_N g_N = 0 \quad\text{for some complex numbers } a_i,$$

then $a_1 = a_2 = \cdots = a_N = 0$. In other words, no element $g_j$ is a
linear combination of the others. In particular, we note that none of the
$g_j$ can be 0. We say that a countable family of elements is linearly
independent if all finite subsets of this family are linearly independent.

If we next successively disregard the elements $h_k$ that are linearly
dependent on the previous elements $h_1, h_2, \ldots, h_{k-1}$, then the
resulting collection $h_1 = f_1, f_2, \ldots, f_k, \ldots$ consists of
linearly independent elements, whose finite linear combinations are the same
as those given by $h_1, h_2, \ldots, h_k, \ldots$, and hence these linear
combinations are also dense in $\mathcal{H}$.

The proof of the theorem now follows from an application of a familiar
construction called the Gram-Schmidt process. Given a finite family
of elements $\{f_1, \ldots, f_k\}$, we call the span of this family the set
of all elements which are finite linear combinations of the elements
$\{f_1, \ldots, f_k\}$. We denote the span of $\{f_1, \ldots, f_k\}$ by
$\mathrm{Span}(\{f_1, \ldots, f_k\})$.

We now construct a sequence of orthonormal vectors $e_1, e_2, \ldots$ such
that $\mathrm{Span}(\{e_1, \ldots, e_n\}) = \mathrm{Span}(\{f_1, \ldots, f_n\})$
for all $n \ge 1$. We do this by induction.

By the linear independence hypothesis, $f_1 \ne 0$, so we may take
$e_1 = f_1/\|f_1\|$. Next, assume that orthonormal vectors
$e_1, \ldots, e_k$ have been found such that
$\mathrm{Span}(\{e_1, \ldots, e_k\}) = \mathrm{Span}(\{f_1, \ldots, f_k\})$
for a given $k$. We then try $e'_{k+1}$ as
$f_{k+1} + \sum_{j=1}^{k} a_j e_j$. To have $(e'_{k+1}, e_j) = 0$ requires
that $a_j = -(f_{k+1}, e_j)$, and this choice of $a_j$ for $1 \le j \le k$
assures that $e'_{k+1}$ is orthogonal to $e_1, \ldots, e_k$. Moreover our
linear independence hypothesis assures that $e'_{k+1} \ne 0$; hence we need
only "renormalize" and take $e_{k+1} = e'_{k+1}/\|e'_{k+1}\|$ to complete
the inductive step. With this we have found an orthonormal basis for
$\mathcal{H}$.
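The inductive step above translates almost line by line into code. The following is a minimal sketch for vectors in $\mathbb{C}^N$ with the standard inner product; the function name `gram_schmidt`, the sample vectors, and the tolerance `1e-12` used to detect linear dependence are assumptions made for illustration.

```python
import math

def inner(u, v):
    """Inner product on C^N: linear in u, conjugate-linear in v."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner(u, u).real)

def gram_schmidt(fs):
    """Orthonormalize linearly independent vectors f_1, ..., f_n in order."""
    es = []
    for f in fs:
        # Try e' = f + sum_j a_j e_j with a_j = -(f, e_j), as in the text.
        e_prime = [complex(c) for c in f]
        for e in es:
            a = -inner(f, e)
            e_prime = [c + a * ec for c, ec in zip(e_prime, e)]
        length = norm(e_prime)
        if length < 1e-12:  # assumed tolerance: e' = 0 means dependence
            raise ValueError("input vectors are not linearly independent")
        es.append([c / length for c in e_prime])  # "renormalize"
    return es

# Three linearly independent vectors in C^3 (hypothetical sample input).
fs = [[1, 1j, 0], [1, 0, 1], [0, 1, 1]]
es = gram_schmidt(fs)
```

Each output vector `es[k]` is a combination of `fs[0], ..., fs[k]` only, which mirrors the requirement that the spans agree at every stage. For long or ill-conditioned families a numerically more stable variant would usually be preferred; the version here follows the text's formula directly.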

Note that we have implicitly assumed that the number of linearly
independent elements $f_1, f_2, \ldots$ is infinite. In the case where there
are only $N$ linearly independent vectors $f_1, \ldots, f_N$, then
$e_1, \ldots, e_N$ constructed in the same way also provide an orthonormal
basis for $\mathcal{H}$. These two cases are differentiated in the following
definition. If $\mathcal{H}$ is a Hilbert space with an orthonormal basis
consisting of finitely many elements, then we say that $\mathcal{H}$ is
finite-dimensional. Otherwise $\mathcal{H}$ is said to be
infinite-dimensional.

2.2 Unitary mappings 

A correspondence between two Hilbert spaces that preserves their
structure is a unitary transformation. More precisely, suppose we are given
two Hilbert spaces $\mathcal{H}$ and $\mathcal{H}'$ with respective inner
products $(\cdot, \cdot)_{\mathcal{H}}$ and $(\cdot, \cdot)_{\mathcal{H}'}$,
and the corresponding norms $\|\cdot\|_{\mathcal{H}}$ and
$\|\cdot\|_{\mathcal{H}'}$. A mapping $U : \mathcal{H} \to \mathcal{H}'$
between these spaces is called unitary if:

(i) $U$ is linear, that is, $U(\alpha f + \beta g) = \alpha U(f) + \beta U(g)$.

(ii) $U$ is a bijection.

(iii) $\|Uf\|_{\mathcal{H}'} = \|f\|_{\mathcal{H}}$ for all $f \in \mathcal{H}$.

Some observations are in order. First, since $U$ is bijective it must
have an inverse $U^{-1} : \mathcal{H}' \to \mathcal{H}$ that is also
unitary. Part (iii) above also implies that if $U$ is unitary, then

$$(Uf, Ug)_{\mathcal{H}'} = (f, g)_{\mathcal{H}} \quad\text{for all } f, g \in \mathcal{H}.$$

To see this, it suffices to "polarize," that is, to note that for any vector
space (say over $\mathbb{C}$) with inner product $(\cdot, \cdot)$ and norm
$\|\cdot\|$, we have

$$(F, G) = \frac{1}{4}\left[ \|F + G\|^2 - \|F - G\|^2 + i\left( \left\|\frac{F}{i} + G\right\|^2 - \left\|\frac{F}{i} - G\right\|^2 \right) \right]$$

whenever $F$ and $G$ are elements of the space.
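The polarization step can be checked numerically: the inner product is recovered from four squared norms. The sketch below assumes the identity in the form $(F, G) = \frac14\left[\|F+G\|^2 - \|F-G\|^2 + i\left(\|F/i + G\|^2 - \|F/i - G\|^2\right)\right]$ for an inner product that is linear in the first slot, and tests it on two arbitrary sample vectors in $\mathbb{C}^3$; the names `polarized`, `F`, and `G` are illustrative.

```python
def inner(u, v):
    """Inner product on C^N: linear in u, conjugate-linear in v."""
    return sum(complex(a) * complex(b).conjugate() for a, b in zip(u, v))

def norm_sq(u):
    return inner(u, u).real

def polarized(F, G):
    """Recover (F, G) from squared norms alone via polarization."""
    F_over_i = [complex(c) / 1j for c in F]
    add = lambda u, v: [a + b for a, b in zip(u, v)]
    sub = lambda u, v: [a - b for a, b in zip(u, v)]
    real_part = norm_sq(add(F, G)) - norm_sq(sub(F, G))                 # 4 Re (F, G)
    imag_part = norm_sq(add(F_over_i, G)) - norm_sq(sub(F_over_i, G))   # 4 Im (F, G)
    return 0.25 * (real_part + 1j * imag_part)

# Arbitrary sample vectors (hypothetical data).
F = [1 + 2j, 3, -1j]
G = [2, 1 - 1j, 4j]
direct = inner(F, G)
recovered = polarized(F, G)
```

Since a unitary map preserves every norm appearing on the right-hand side, it preserves the left-hand side as well, which is the content of the observation in the text.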

The above leads us to say that the two Hilbert spaces $\mathcal{H}$ and
$\mathcal{H}'$ are unitarily equivalent or unitarily isomorphic if there
exists a unitary mapping $U : \mathcal{H} \to \mathcal{H}'$. Clearly,
unitary isomorphism of Hilbert spaces is an equivalence relation.

With this definition we are now in a position to give precise meaning
to the statement we made earlier that all infinite-dimensional Hilbert
spaces are the same and in that sense $\ell^2(\mathbb{Z})$ in disguise.


Corollary 2.5 Any two infinite-dimensional Hilbert spaces are unitarily
equivalent.


Proof. If $\mathcal{H}$ and $\mathcal{H}'$ are two infinite-dimensional
Hilbert spaces, we may select for each an orthonormal basis, say

$$\{e_1, e_2, \ldots\} \subset \mathcal{H} \quad\text{and}\quad \{e'_1, e'_2, \ldots\} \subset \mathcal{H}'.$$

Then, consider the mapping defined as follows: if
$f = \sum_{k=1}^\infty a_k e_k$, then

$$U(f) = g, \quad\text{where } g = \sum_{k=1}^{\infty} a_k e'_k.$$

Clearly, the mapping $U$ is both linear and invertible. Moreover, by
Parseval's identity, we must have

$$\|Uf\|^2_{\mathcal{H}'} = \|g\|^2_{\mathcal{H}'} = \sum_{k=1}^{\infty} |a_k|^2 = \|f\|^2_{\mathcal{H}},$$

and the corollary is proved.

Consequently, all infinite-dimensional Hilbert spaces are unitarily
equivalent to $\ell^2(\mathbb{N})$, and thus, by relabeling, to
$\ell^2(\mathbb{Z})$. By similar reasoning we also have the following:

Corollary 2.6 Any two finite-dimensional Hilbert spaces are unitarily
equivalent if and only if they have the same dimension.

Thus every finite-dimensional Hilbert space over $\mathbb{C}$ (or over
$\mathbb{R}$) is equivalent with $\mathbb{C}^d$ (or $\mathbb{R}^d$), for
some $d$.

2.3 Pre-Hilbert spaces 

Although Hilbert spaces arise naturally, one often starts with a
pre-Hilbert space instead, that is, a space $\mathcal{H}_0$ that satisfies
all the defining properties of a Hilbert space except (v); in other words
$\mathcal{H}_0$ is not assumed to be complete. A prime example arose
implicitly early in the study of Fourier series with the space
$\mathcal{H}_0 = \mathcal{R}$ of Riemann integrable functions on
$[-\pi, \pi]$ with the usual inner product; we return to this below. Other
examples appear in the next chapter in the study of the solutions of
partial differential equations.

Fortunately, every pre-Hilbert space $\mathcal{H}_0$ can be completed.

Proposition 2.7 Suppose we are given a pre-Hilbert space $\mathcal{H}_0$
with inner product $(\cdot, \cdot)_0$. Then we can find a Hilbert space
$\mathcal{H}$ with inner product $(\cdot, \cdot)$ such that

(i) $\mathcal{H}_0 \subset \mathcal{H}$.

(ii) $(f, g)_0 = (f, g)$ whenever $f, g \in \mathcal{H}_0$.

(iii) $\mathcal{H}_0$ is dense in $\mathcal{H}$.

A Hilbert space satisfying properties like $\mathcal{H}$ in the above
proposition is called a completion of $\mathcal{H}_0$. We shall only sketch
the construction of $\mathcal{H}$, since it follows closely Cantor's
familiar method of obtaining the real numbers as the completion of the
rationals in terms of Cauchy sequences of rationals.

Indeed, consider the collection of all Cauchy sequences $\{f_n\}$ with
$f_n \in \mathcal{H}_0$, $1 \le n < \infty$. One defines an equivalence
relation in this collection by saying that $\{f_n\}$ is equivalent to
$\{f'_n\}$ if $f_n - f'_n$ converges to 0 as $n \to \infty$. The collection
of equivalence classes is then taken to be $\mathcal{H}$. One then easily
verifies that $\mathcal{H}$ inherits the structure of a vector space, with
an inner product $(f, g)$ defined as $\lim_{n\to\infty} (f_n, g_n)$, where
$\{f_n\}$ and $\{g_n\}$ are Cauchy sequences in $\mathcal{H}_0$,
representing, respectively, the elements $f$ and $g$ in $\mathcal{H}$.
Next, if $f \in \mathcal{H}_0$ we take the sequence $\{f_n\}$, with
$f_n = f$ for all $n$, to represent $f$ as an element of $\mathcal{H}$,
giving $\mathcal{H}_0 \subset \mathcal{H}$. To see that $\mathcal{H}$ is
complete, let $\{F^k\}_{k=1}^\infty$ be a Cauchy sequence in $\mathcal{H}$,
with each $F^k$ represented by $\{f^k_n\}_{n=1}^\infty$,
$f^k_n \in \mathcal{H}_0$. If we define $F \in \mathcal{H}$ as represented
by the sequence $\{f_n\}$ with $f_n = f^n_{N(n)}$, where $N(n)$ is so that
$\|f^n_{N(n)} - f^n_j\| \le 1/n$ for $j \ge N(n)$, then we note that
$F^k \to F$ in $\mathcal{H}$.

One can also observe that the completion $\mathcal{H}$ of $\mathcal{H}_0$
is unique up to isomorphism. (See Exercise 14.)

3 Fourier series and Fatou’s theorem 

We have already seen an interesting relation between Hilbert spaces and 
some elementary facts about Fourier series. Here we want to pursue this 
idea and also connect it with complex analysis. 

When considering Fourier series, it is natural to begin by turning to
the broader class of all integrable functions on $[-\pi, \pi]$. Indeed,
note that $L^2([-\pi, \pi]) \subset L^1([-\pi, \pi])$, by the Cauchy-Schwarz
inequality, since the interval $[-\pi, \pi]$ has finite measure. Thus, if
$f \in L^1([-\pi, \pi])$ and $n \in \mathbb{Z}$, we define the $n$th Fourier
coefficient of $f$ by

$$a_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx}\,dx.$$

The Fourier series of $f$ is then formally
$\sum_{n=-\infty}^{\infty} a_n e^{inx}$, and we write

$$f(x) \sim \sum_{n=-\infty}^{\infty} a_n e^{inx}$$

to indicate that the sum on the right is the Fourier series of the
function on the left. The theory developed thus far provides the natural
generalization of some earlier results obtained in Book I.

Theorem 3.1 Suppose $f$ is integrable on $[-\pi, \pi]$.

(i) If $a_n = 0$ for all $n$, then $f(x) = 0$ for a.e. $x$.

(ii) $\sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{inx}$ tends to $f(x)$ for
a.e. $x$, as $r \to 1$, $r < 1$.

The second conclusion is the almost everywhere "Abel summability" to
$f$ of its Fourier series. Note that since
$|a_n| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x)|\,dx$, the series
$\sum a_n r^{|n|} e^{inx}$ converges absolutely and uniformly for each $r$,
$0 \le r < 1$.

Proof. The first conclusion is an immediate consequence of the second.
To prove the latter we recall the identity

$$\sum_{n=-\infty}^{\infty} r^{|n|} e^{iny} = P_r(y) = \frac{1 - r^2}{1 - 2r\cos y + r^2}$$

for the Poisson kernel; see Book I, Chapter 2. Starting with our given
$f \in L^1([-\pi, \pi])$ we extend it as a function on $\mathbb{R}$ by
making it periodic of period $2\pi$.$^3$ We then claim that for every $x$


(3)   $$\sum_{n=-\infty}^{\infty} a_n r^{|n|} e^{inx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) P_r(y)\,dy.$$

Indeed, by the dominated convergence theorem the right-hand side equals

$$\sum_{n=-\infty}^{\infty} r^{|n|} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) e^{iny}\,dy.$$

Moreover, for each $x$ and $n$

$$\begin{aligned}
\int_{-\pi}^{\pi} f(x - y) e^{iny}\,dy &= \int_{-\pi + x}^{\pi + x} f(y) e^{in(x - y)}\,dy \\
&= e^{inx} \int_{-\pi}^{\pi} f(y) e^{-iny}\,dy = e^{inx}\, 2\pi a_n.
\end{aligned}$$

$^3$ Note that we may without loss of generality assume that
$f(\pi) = f(-\pi)$ so as to make the periodic extension unambiguous.

The first equality follows by translation invariance (see Section 3,
Chapter 2), and the second since
$\int_{-\pi}^{\pi} F(y)\,dy = \int_I F(y)\,dy$ whenever $F$ is periodic of
period $2\pi$ and $I$ is an interval of length $2\pi$ (Exercise 3,
Chapter 2). With these observations, the identity (3) is established. We
can now invoke the facts about approximations to the identity (Theorem 2.1
and Example 4, Chapter 3) to conclude that the left-hand side of (3) tends
to $f(x)$ at every point of the Lebesgue set of $f$, hence almost
everywhere. (To be correct, the hypotheses of the theorem require that $f$
be integrable on all of $\mathbb{R}$. We can achieve this for our periodic
function by setting $f$ equal to zero outside $[-2\pi, 2\pi]$, and then (3)
still holds for this modified $f$, whenever $x \in [-\pi, \pi]$.)
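The Poisson kernel identity used at the start of this proof, $\sum_{n} r^{|n|} e^{iny} = (1 - r^2)/(1 - 2r\cos y + r^2)$, can be checked numerically by truncating the series. The sketch below is illustrative only: the function names and the truncation length `terms` are choices made here, and the truncated sum differs from the full series by a geometric tail of order $r^{\text{terms}}$.

```python
import cmath
import math

def poisson_series(r, y, terms=400):
    """Truncated sum of r^|n| e^{iny} over |n| <= terms."""
    total = 1.0 + 0j  # the n = 0 term
    for n in range(1, terms + 1):
        total += r ** n * (cmath.exp(1j * n * y) + cmath.exp(-1j * n * y))
    return total

def poisson_closed(r, y):
    """Closed form (1 - r^2) / (1 - 2 r cos y + r^2)."""
    return (1 - r * r) / (1 - 2 * r * math.cos(y) + r * r)

# Compare the two expressions at a few sample points (hypothetical choices).
samples = [(0.5, 1.0), (0.9, 0.3), (0.9, -2.5)]
errors = [abs(poisson_series(r, y) - poisson_closed(r, y)) for r, y in samples]
```

As $r$ approaches 1 the kernel concentrates near $y = 0$, which is the behavior behind the approximation-to-the-identity argument invoked in the proof.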

We return to the more restrictive setting of $L^2$. We express the
essential conclusions of Theorem 2.3 in the context of Fourier series. With
$f \in L^2([-\pi, \pi])$, we write as before
$a_n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x) e^{-inx}\,dx$.

Theorem 3.2 Suppose $f \in L^2([-\pi, \pi])$. Then:

(i) We have Parseval's relation

$$\sum_{n=-\infty}^{\infty} |a_n|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2\,dx.$$

(ii) The mapping $f \mapsto \{a_n\}$ is a unitary correspondence between
$L^2([-\pi, \pi])$ and $\ell^2(\mathbb{Z})$.

(iii) The Fourier series of $f$ converges to $f$ in the $L^2$-norm, that is,

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x) - S_N(f)(x)|^2\,dx \to 0 \quad\text{as } N \to \infty,$$

where $S_N(f) = \sum_{|n| \le N} a_n e^{inx}$.

To apply the previous results, we let $\mathcal{H} = L^2([-\pi, \pi])$ with
inner product
$(f, g) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\overline{g(x)}\,dx$, and take
the orthonormal set $\{e_k\}_{k=1}^\infty$ to be the exponentials
$\{e^{inx}\}_{n=-\infty}^{\infty}$, with $k = 1$ when $n = 0$, $k = 2n$ for
$n > 0$, and $k = 2|n| + 1$ for $n < 0$.

By the previous result, assertion (ii) of Theorem 2.3 holds and thus
all the other conclusions hold. We therefore have Parseval's relation,
and from (iv) we conclude that
$\|f - S_N(f)\|^2 = \sum_{|n| > N} |a_n|^2 \to 0$ as $N \to \infty$.
Similarly, if $\{a_n\} \in \ell^2(\mathbb{Z})$ is given, then
$\|S_N(f) - S_M(f)\|^2 \to 0$, as $N, M \to \infty$. Hence the completeness
of $L^2$ guarantees that there is an $f \in L^2$ such that
$\|f - S_N(f)\| \to 0$, and one verifies directly that $f$ has $\{a_n\}$ as
its Fourier coefficients. Thus we deduce that the mapping
$f \mapsto \{a_n\}$ is onto and hence unitary. This is a key conclusion that
holds in the setting of $L^2$ and was not valid in an earlier context of
Riemann integrable functions. In fact the space $\mathcal{R}$ of such
functions on $[-\pi, \pi]$ is not complete in the norm, containing as it
does the continuous functions, but $\mathcal{R}$ is itself restricted to
bounded functions.
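The relabeling of the exponentials $\{e^{inx}\}_{n \in \mathbb{Z}}$ as a one-sided sequence $\{e_k\}_{k \ge 1}$ used in this argument needs to be a bijection between $\mathbb{Z}$ and the positive integers. The sketch below assumes the assignment $k = 1$ for $n = 0$, $k = 2n$ for $n > 0$, and $k = 2|n| + 1$ for $n < 0$ (odd indices starting at 3 for negative $n$, so that no index is used twice); the function names `k_of_n` and `n_of_k` are hypothetical.

```python
def k_of_n(n):
    """Map a frequency n in Z to a position k >= 1 in the one-sided sequence."""
    if n == 0:
        return 1
    return 2 * n if n > 0 else 2 * abs(n) + 1

def n_of_k(k):
    """Inverse map: recover the frequency n from the position k >= 1."""
    if k == 1:
        return 0
    return k // 2 if k % 2 == 0 else -((k - 1) // 2)

# The frequencies -3, ..., 3 should occupy positions 1, ..., 7 exactly once.
positions = sorted(k_of_n(n) for n in range(-3, 4))
round_trip_ok = all(n_of_k(k_of_n(n)) == n for n in range(-50, 51))
```

Under this ordering the first $2N + 1$ terms of the relabeled sequence are exactly the exponentials with $|n| \le N$, so partial sums over $k \le 2N + 1$ coincide with the symmetric partial sums $S_N(f)$.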

3.1 Fatou’s theorem 

Fatou's theorem is a remarkable result in complex analysis. Its proof
combines elements of Hilbert spaces, Fourier series, and deeper ideas of
differentiation theory, and yet none of these notions appear in its
statement. The question that Fatou's theorem answers may be put simply as
follows.

Suppose $F(z)$ is holomorphic in the unit disc
$\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\}$. What are conditions on $F$
that guarantee that $F(z)$ will converge, in an appropriate sense, to
boundary values $F(e^{i\theta})$ on the unit circle?

In general a holomorphic function in the unit disc can behave quite
erratically near the boundary. It turns out, however, that imposing a
simple boundedness condition is enough to obtain a strong conclusion.

If $F$ is a function defined in the unit disc $\mathbb{D}$, we say that $F$
has a radial limit at the point $-\pi < \theta \le \pi$ on the circle, if
the limit

$$\lim_{\substack{r \to 1 \\ r < 1}} F(re^{i\theta})$$

exists.

Theorem 3.3 A bounded holomorphic function $F(re^{i\theta})$ on the unit
disc has radial limits at almost every $\theta$.

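Proof. We know that $F(z)$ has a power series expansion
$\sum_{n=0}^\infty a_n z^n$ in $\mathbb{D}$ that converges absolutely and
uniformly whenever $z = re^{i\theta}$ and $r < 1$. In fact, for $r < 1$ the
series $\sum_{n=0}^\infty a_n r^n e^{in\theta}$ is the Fourier series of the
function $F(re^{i\theta})$, that is,

$$a_n r^n = \frac{1}{2\pi} \int_{-\pi}^{\pi} F(re^{i\theta}) e^{-in\theta}\,d\theta \quad\text{when } n \ge 0,$$

and the integral vanishes when $n < 0$. (See also Chapter 3, Section 7 in
Book II).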




We pick $M$ so that $|F(z)| \le M$, for all $z \in \mathbb{D}$. By
Parseval's identity

$$\sum_{n=0}^{\infty} |a_n|^2 r^{2n} = \frac{1}{2\pi} \int_{-\pi}^{\pi} |F(re^{i\theta})|^2\,d\theta \quad\text{for each } 0 \le r < 1.$$

Letting $r \to 1$ one sees that $\sum |a_n|^2$ converges (and is
$\le M^2$). We now let $F(e^{i\theta})$ be the $L^2$-function whose Fourier
coefficients are $a_n$ when $n \ge 0$, and 0 when $n < 0$. Hence by
conclusion (ii) in Theorem 3.1

$$\sum_{n=0}^{\infty} a_n r^n e^{in\theta} \to F(e^{i\theta}), \quad\text{for a.e. } \theta,$$

concluding the proof of the theorem.

If we examine the argument given above we see that the same conclusion
holds for a larger class of functions. In this connection, we define
the Hardy space $H^2(\mathbb{D})$ to consist of all holomorphic functions
$F$ on the unit disc $\mathbb{D}$ that satisfy

$$\sup_{0 \le r < 1} \frac{1}{2\pi} \int_{-\pi}^{\pi} |F(re^{i\theta})|^2\,d\theta < \infty.$$

We also define the "norm" for functions $F$ in this class,
$\|F\|_{H^2(\mathbb{D})}$, to be the square root of the above quantity.

One notes that if $F$ is bounded, then $F \in H^2(\mathbb{D})$, and moreover
the conclusion of the existence of radial limits almost everywhere holds for
any $F \in H^2(\mathbb{D})$, by the same argument given for the bounded
case.$^4$ Finally, one notes that $F \in H^2(\mathbb{D})$ if and only if
$F(z) = \sum_{n=0}^\infty a_n z^n$ with $\sum_{n=0}^\infty |a_n|^2 < \infty$;
moreover, $\sum_{n=0}^\infty |a_n|^2 = \|F\|^2_{H^2(\mathbb{D})}$. This
states in particular that $H^2(\mathbb{D})$ is in fact a Hilbert space that
can be viewed as the "subspace" $\ell^2(\mathbb{Z}^+)$ of
$\ell^2(\mathbb{Z})$, consisting of all $\{a_n\} \in \ell^2(\mathbb{Z})$,
with $a_n = 0$ when $n < 0$.
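The Parseval identity on circles of radius $r$, which drives both the proof of Fatou's theorem and the description of $H^2(\mathbb{D})$, can be illustrated on a concrete bounded function. The sketch below uses $F(z) = 1/(2 - z)$, an example chosen here (not from the text), whose Taylor coefficients are $a_n = 2^{-(n+1)}$ and which satisfies $|F(z)| \le 1$ on the disc; the circle integral is approximated by an equally spaced average, and the names `mean_square` and `coefficient_sum` are illustrative.

```python
import cmath
import math

def F(z):
    """Illustrative bounded holomorphic function on the unit disc."""
    return 1 / (2 - z)

def mean_square(r, m=512):
    """Approximate (1/2pi) * integral of |F(r e^{i theta})|^2 over [-pi, pi]."""
    total = 0.0
    for j in range(m):
        theta = -math.pi + 2 * math.pi * j / m
        total += abs(F(r * cmath.exp(1j * theta))) ** 2
    return total / m

def coefficient_sum(r, terms=200):
    """Truncated sum of |a_n|^2 r^(2n) with a_n = 2^-(n+1)."""
    return sum((0.5 ** (n + 1)) ** 2 * r ** (2 * n) for n in range(terms))

r = 0.9
lhs = coefficient_sum(r)   # coefficient side of Parseval's identity
rhs = mean_square(r)       # integral side of Parseval's identity
h2_norm_sq = coefficient_sum(1.0)  # sum of |a_n|^2, the squared H^2 "norm"
```

For this example the geometric series gives $\sum |a_n|^2 r^{2n} = 1/(4 - r^2)$, which increases to $1/3$ as $r \to 1$, consistent with the bound $\sum |a_n|^2 \le M^2$ with $M = 1$.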

Some general considerations of subspaces and their concomitant
orthogonal projections will be taken up next.

4 Closed subspaces and orthogonal projections 

A linear subspace $S$ (or simply subspace) of $\mathcal{H}$ is a subset of
$\mathcal{H}$ that satisfies $\alpha f + \beta g \in S$ whenever
$f, g \in S$ and $\alpha, \beta$ are scalars. In other words, $S$ is also a
vector space. For example in $\mathbb{R}^3$, lines passing through the
origin and planes passing through the origin are the one-dimensional and
two-dimensional subspaces, respectively.

$^4$ An even more general statement is given in Problem 5*.

The subspace $S$ is closed if whenever $\{f_n\} \subset S$ converges to some
$f \in \mathcal{H}$, then $f$ also belongs to $S$. In the case of
finite-dimensional Hilbert spaces, every subspace is closed. This is,
however, not true in the general case of infinite-dimensional Hilbert
spaces. For instance, as we have already indicated, the subspace of Riemann
integrable functions in $L^2([-\pi, \pi])$ is not closed, nor is the
subspace obtained by fixing a basis and taking all vectors that are finite
linear combinations of these basis elements. It is useful to note that every
closed subspace $S$ of $\mathcal{H}$ is itself a Hilbert space, with the
inner product on $S$ being the one inherited from $\mathcal{H}$. (For the
separability of $S$, see Exercise 11.)

Next, we show that a closed subspace enjoys an important characteristic
property of Euclidean geometry.

Lemma 4.1 Suppose $S$ is a closed subspace of $\mathcal{H}$ and
$f \in \mathcal{H}$. Then:

(i) There exists a (unique) element $g_0 \in S$ which is closest to $f$, in
the sense that

$$\|f - g_0\| = \inf_{g \in S} \|f - g\|.$$

(ii) The element $f - g_0$ is perpendicular to $S$, that is,

$$(f - g_0, g) = 0 \quad\text{for all } g \in S.$$

The situation in the lemma can be visualized as in Figure 1.

Figure 1. Nearest element to $f$ in $S$

Proof. If $f \in S$, then we choose $f = g_0$, and there is nothing left
to prove. Otherwise, we let $d = \inf_{g \in S} \|f - g\|$, and note that we
must have $d > 0$ since $f \notin S$ and $S$ is closed. Consider a sequence
$\{g_n\}_{n=1}^\infty$ in $S$ such that

$$\|f - g_n\| \to d \quad\text{as } n \to \infty.$$

We claim that $\{g_n\}$ is a Cauchy sequence whose limit will be the desired
element $g_0$. In fact, it would suffice to show that a subsequence of
$\{g_n\}$ converges, and this is immediate in the finite-dimensional case
because a closed ball is compact. However, in general this compactness
fails, as we shall see in Section 6, and so a more intricate argument is
needed at this point.

To prove our claim, we use the parallelogram law, which states that
in a Hilbert space $\mathcal{H}$

(4)   $$\|A + B\|^2 + \|A - B\|^2 = 2\left[\|A\|^2 + \|B\|^2\right] \quad\text{for all } A, B \in \mathcal{H}.$$

The simple verification of this equality, which consists of writing each
norm in terms of the inner product, is left to the reader. Putting
$A = f - g_n$ and $B = f - g_m$ in the parallelogram law, we find

$$\|2f - (g_n + g_m)\|^2 + \|g_m - g_n\|^2 = 2\left[\|f - g_n\|^2 + \|f - g_m\|^2\right].$$

However $S$ is a subspace, so the quantity $\frac{1}{2}(g_n + g_m)$ belongs
to $S$, hence

$$\|2f - (g_n + g_m)\| = 2\left\|f - \tfrac{1}{2}(g_n + g_m)\right\| \ge 2d.$$

Therefore

$$\begin{aligned}
\|g_m - g_n\|^2 &= 2\left[\|f - g_n\|^2 + \|f - g_m\|^2\right] - \|2f - (g_n + g_m)\|^2 \\
&\le 2\left[\|f - g_n\|^2 + \|f - g_m\|^2\right] - 4d^2.
\end{aligned}$$

By construction, we know that $\|f - g_n\| \to d$ and $\|f - g_m\| \to d$ as
$n, m \to \infty$, so the above inequality implies that $\{g_n\}$ is a
Cauchy sequence. Since $\mathcal{H}$ is complete and $S$ closed, the
sequence $\{g_n\}$ must have a limit $g_0$ in $S$, and then it satisfies
$d = \|f - g_0\|$.

We prove that if $g \in \mathcal{S}$, then $g \perp (f - g_0)$. For each
$\epsilon$ (positive or negative), consider the perturbation of $g_0$ defined
by $g_0 - \epsilon g$. This element belongs to $\mathcal{S}$, hence

$$\|f - (g_0 - \epsilon g)\|^2 \ge \|f - g_0\|^2.$$

Since $\|f - (g_0 - \epsilon g)\|^2 = \|f - g_0\|^2 + \epsilon^2 \|g\|^2
+ 2\epsilon \,\mathrm{Re}(f - g_0, g)$, we find that

$$2\epsilon \,\mathrm{Re}(f - g_0, g) + \epsilon^2 \|g\|^2 \ge 0. \tag{5}$$

If $\mathrm{Re}(f - g_0, g) < 0$, then taking $\epsilon$ small and positive
contradicts (5). If $\mathrm{Re}(f - g_0, g) > 0$, a contradiction also
follows by taking $\epsilon$ small and negative. Thus
$\mathrm{Re}(f - g_0, g) = 0$. By considering the perturbation
$g_0 - i\epsilon g$, a similar argument gives $\mathrm{Im}(f - g_0, g) = 0$,
and hence $(f - g_0, g) = 0$.

Finally, the uniqueness of $g_0$ follows from the above observation about
orthogonality. Suppose $\tilde{g}_0$ is another point in $\mathcal{S}$ that
minimizes the distance to $f$. By taking $g = g_0 - \tilde{g}_0$ in our last
argument we find $(f - g_0) \perp (g_0 - \tilde{g}_0)$, and the Pythagorean
theorem gives

$$\|f - \tilde{g}_0\|^2 = \|f - g_0\|^2 + \|g_0 - \tilde{g}_0\|^2.$$

Since by assumption $\|f - \tilde{g}_0\|^2 = \|f - g_0\|^2$, we conclude that
$\|g_0 - \tilde{g}_0\| = 0$, as desired.
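
As a small illustration of the lemma (an example of our own choosing, not
taken from the text), one may take $\mathcal{H} = \mathbb{R}^2$ with the usual
inner product, $\mathcal{S}$ the diagonal line, and $f = (1, 0)$. The closest
point and the orthogonality in (ii) can then be checked by hand:

```latex
\[
\mathcal{S} = \{(t,t) : t \in \mathbb{R}\}, \qquad f = (1,0), \qquad
\|f - (t,t)\|^2 = (1-t)^2 + t^2 = 2t^2 - 2t + 1 .
\]
% The quadratic is minimized at t = 1/2, so
\[
g_0 = \left(\tfrac12, \tfrac12\right), \qquad
d = \|f - g_0\| = \tfrac{1}{\sqrt 2}, \qquad
(f - g_0, (t,t)) = \tfrac12\, t - \tfrac12\, t = 0 \quad \text{for all } t .
\]
```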

Using the lemma, we may now introduce a useful concept that is another
expression of the notion of orthogonality. If $\mathcal{S}$ is a subspace of a
Hilbert space $\mathcal{H}$, we define the orthogonal complement of
$\mathcal{S}$ by

$$\mathcal{S}^{\perp} = \{f \in \mathcal{H} : (f, g) = 0 \text{ for all } g \in \mathcal{S}\}.$$

Clearly, $\mathcal{S}^{\perp}$ is also a subspace of $\mathcal{H}$, and
moreover $\mathcal{S} \cap \mathcal{S}^{\perp} = \{0\}$. To see this, note
that if $f \in \mathcal{S} \cap \mathcal{S}^{\perp}$, then $f$ must be
orthogonal to itself; thus $0 = (f, f) = \|f\|^2$, and therefore $f = 0$.
Moreover, $\mathcal{S}^{\perp}$ is itself a closed subspace. Indeed, if
$f_n \to f$, then $(f_n, g) \to (f, g)$ for every $g$, by the Cauchy-Schwarz
inequality. Hence if $(f_n, g) = 0$ for all $g \in \mathcal{S}$ and all $n$,
then $(f, g) = 0$ for all those $g$.

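Proposition 4.2 If $\mathcal{S}$ is a closed subspace of a Hilbert space
$\mathcal{H}$, then

$$\mathcal{H} = \mathcal{S} \oplus \mathcal{S}^{\perp}.$$

The notation in the proposition means that every $f \in \mathcal{H}$ can be
written uniquely as $f = g + h$, where $g \in \mathcal{S}$ and
$h \in \mathcal{S}^{\perp}$; we say that $\mathcal{H}$ is the direct sum of
$\mathcal{S}$ and $\mathcal{S}^{\perp}$. This is equivalent to saying that any
$f$ in $\mathcal{H}$ is the sum of two elements, one in $\mathcal{S}$, the
other in $\mathcal{S}^{\perp}$, and that $\mathcal{S} \cap \mathcal{S}^{\perp}$
contains only 0.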

The proof of the proposition relies on the previous lemma giving the
closest element of $f$ in $\mathcal{S}$. In fact, for any $f \in \mathcal{H}$,
we choose $g_0$ as in the lemma and write

$$f = g_0 + (f - g_0).$$

By construction $g_0 \in \mathcal{S}$, and the lemma implies
$f - g_0 \in \mathcal{S}^{\perp}$, and this shows that $f$ is the sum of an
element in $\mathcal{S}$ and one in $\mathcal{S}^{\perp}$. To prove that this
decomposition is unique, suppose that

$$f = g + h = \tilde{g} + \tilde{h} \quad \text{where } g, \tilde{g} \in \mathcal{S} \text{ and } h, \tilde{h} \in \mathcal{S}^{\perp}.$$

Then, we must have $g - \tilde{g} = \tilde{h} - h$. Since the left-hand side
belongs to $\mathcal{S}$ while the right-hand side belongs to
$\mathcal{S}^{\perp}$, the fact that
$\mathcal{S} \cap \mathcal{S}^{\perp} = \{0\}$ implies $g - \tilde{g} = 0$ and
$\tilde{h} - h = 0$. Therefore $g = \tilde{g}$ and $h = \tilde{h}$, and the
uniqueness is established.

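With the decomposition $\mathcal{H} = \mathcal{S} \oplus \mathcal{S}^{\perp}$
one has the natural projection onto $\mathcal{S}$ defined by

$$P_{\mathcal{S}}(f) = g, \quad \text{where } f = g + h \text{ and } g \in \mathcal{S},\ h \in \mathcal{S}^{\perp}.$$

The mapping $P_{\mathcal{S}}$ is called the orthogonal projection onto
$\mathcal{S}$ and satisfies the following simple properties:

(i) $f \mapsto P_{\mathcal{S}}(f)$ is linear,

(ii) $P_{\mathcal{S}}(f) = f$ whenever $f \in \mathcal{S}$,

(iii) $P_{\mathcal{S}}(f) = 0$ whenever $f \in \mathcal{S}^{\perp}$,

(iv) $\|P_{\mathcal{S}}(f)\| \le \|f\|$ for all $f \in \mathcal{H}$.

Property (i) means that
$P_{\mathcal{S}}(\alpha f_1 + \beta f_2) = \alpha P_{\mathcal{S}}(f_1) + \beta P_{\mathcal{S}}(f_2)$,
whenever $f_1, f_2 \in \mathcal{H}$ and $\alpha$ and $\beta$ are scalars.

It will be useful to observe the following. Suppose $\{e_k\}$ is a (finite
or infinite) collection of orthonormal vectors in $\mathcal{H}$. Then the
orthogonal projection $P$ onto the closure of the subspace spanned by
$\{e_k\}$ is given by $P(f) = \sum_k (f, e_k) e_k$. In case the collection is
infinite, the sum converges in the norm of $\mathcal{H}$.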
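
The projection formula can be tried out in a finite-dimensional setting. The
following is a minimal sketch, assuming $\mathcal{H} = \mathbb{C}^3$ with the
inner product $(f, g) = \sum_j f_j \overline{g_j}$; the vectors `e1`, `e2`,
`f` and the helper names are hypothetical choices made only for illustration.

```python
# Orthogonal projection onto the span of orthonormal vectors in C^3,
# using the inner product (f, g) = sum_j f_j * conj(g_j).

def inner(f, g):
    return sum(a * b.conjugate() for a, b in zip(f, g))

def norm(f):
    return abs(inner(f, f)) ** 0.5

def project(f, basis):
    """Return P(f) = sum_k (f, e_k) e_k for an orthonormal list `basis`."""
    out = [0j] * len(f)
    for e in basis:
        c = inner(f, e)                      # coefficient (f, e_k)
        out = [o + c * x for o, x in zip(out, e)]
    return out

s = 2 ** -0.5
e1 = [s + 0j, s * 1j, 0j]    # unit vector (illustrative choice)
e2 = [0j, 0j, 1 + 0j]        # unit vector orthogonal to e1
f = [1 + 2j, 3 - 1j, -2 + 0.5j]

Pf = project(f, [e1, e2])
residual = [a - b for a, b in zip(f, Pf)]    # f - P(f), expected in S-perp
```

In this sketch `residual` should be orthogonal to both `e1` and `e2`,
`norm(Pf)` should not exceed `norm(f)`, and projecting `Pf` again should
return `Pf`, in line with properties (ii) to (iv).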

We illustrate this with two examples that arise in Fourier analysis. 

Example 1. On $L^2([-\pi, \pi])$, recall that if
$f(\theta) \sim \sum_{n=-\infty}^{\infty} a_n e^{in\theta}$ then the partial
sums of the Fourier series are

$$S_N(f)(\theta) = \sum_{n=-N}^{N} a_n e^{in\theta}.$$

Therefore, the partial sum operator $S_N$ consists of the projection onto
the closed subspace spanned by $\{e_{-N}, \ldots, e_N\}$.

The sum $S_N$ can be realized as a convolution

$$S_N(f)(\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(\theta - \varphi) f(\varphi)\, d\varphi,$$

where $D_N(\theta) = \sin((N + 1/2)\theta) / \sin(\theta/2)$ is the Dirichlet
kernel.
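
The closed form of the Dirichlet kernel can be compared numerically with the
exponential sum $\sum_{n=-N}^{N} e^{in\theta}$, which is the standard identity
behind the convolution formula. The sketch below is only a sanity check; the
function names and the sample values of `N` and `theta` are our own.

```python
import cmath
import math

def dirichlet_sum(N, theta):
    """Sum of e^{i n theta} for n = -N, ..., N."""
    return sum(cmath.exp(1j * n * theta) for n in range(-N, N + 1))

def dirichlet_closed(N, theta):
    """Closed form sin((N + 1/2) theta) / sin(theta / 2), for theta not in 2*pi*Z."""
    return math.sin((N + 0.5) * theta) / math.sin(theta / 2)

N, theta = 5, 0.7            # illustrative sample values
lhs = dirichlet_sum(N, theta)
rhs = dirichlet_closed(N, theta)
```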

Example 2. Once again, consider $L^2([-\pi, \pi])$ and let $\mathcal{S}$
denote the subspace that consists of all $F \in L^2([-\pi, \pi])$ with

$$F(\theta) \sim \sum_{n=0}^{\infty} a_n e^{in\theta}.$$

In other words, $\mathcal{S}$ is the space of square integrable functions
whose Fourier coefficients $a_n$ vanish for $n < 0$. From the proof of Fatou's
theorem, this implies that $\mathcal{S}$ can be identified with the Hardy
space $H^2(\mathbb{D})$, where $\mathbb{D}$ is the unit disc, and so is a
closed subspace unitarily isomorphic to $\ell^2(\mathbb{Z}^+)$. Therefore,
using this identification, if $P$ denotes the orthogonal projection from
$L^2([-\pi, \pi])$ to $\mathcal{S}$, we may also write $P(f)(z)$ for the
element corresponding to $H^2(\mathbb{D})$, that is,

$$P(f)(z) = \sum_{n=0}^{\infty} a_n z^n.$$

Given $f \in L^2([-\pi, \pi])$, we define the Cauchy integral of $f$ by

$$C(f)(z) = \frac{1}{2\pi i} \int_{\gamma} \frac{f(\zeta)}{\zeta - z}\, d\zeta,$$

where $\gamma$ denotes the unit circle and $z$ belongs to the unit disc. Then
we have the identity

$$P(f)(z) = C(f)(z), \quad \text{for all } z \in \mathbb{D}.$$

Indeed, since $f \in L^2$ it follows by the Cauchy-Schwarz inequality that
$f \in L^1([-\pi, \pi])$, and therefore we may interchange the sum and
integral in the following calculation (recall $|z| < 1$):

$$\begin{aligned}
P(f)(z) = \sum_{n=0}^{\infty} a_n z^n
&= \sum_{n=0}^{\infty} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} f(e^{i\theta}) e^{-in\theta}\, d\theta \right) z^n \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(e^{i\theta}) \sum_{n=0}^{\infty} (e^{-i\theta} z)^n \, d\theta \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(e^{i\theta})}{1 - e^{-i\theta} z}\, d\theta \\
&= \frac{1}{2\pi i} \int_{-\pi}^{\pi} \frac{f(e^{i\theta})}{e^{i\theta} - z}\, i e^{i\theta}\, d\theta \\
&= C(f)(z).
\end{aligned}$$
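
The identity $P(f)(z) = C(f)(z)$ can be checked numerically for a
trigonometric polynomial. The following is a rough sketch under our own
assumptions: the coefficients in `coeffs`, the point `z`, and the use of an
equally spaced Riemann sum for the integral over the circle are illustrative
choices, not part of the text.

```python
import cmath
import math

# Hypothetical Fourier coefficients a_n of a trigonometric polynomial f.
coeffs = {-2: 0.5, -1: -1j, 0: 2.0, 1: 1 + 1j, 2: -0.25}

def f(theta):
    """f(e^{i theta}) = sum_n a_n e^{i n theta}."""
    return sum(a * cmath.exp(1j * n * theta) for n, a in coeffs.items())

def hardy_projection(z):
    """P(f)(z): keep only the coefficients with n >= 0."""
    return sum(a * z ** n for n, a in coeffs.items() if n >= 0)

def cauchy_integral(z, M=512):
    """Riemann sum for (1/2 pi) * integral of f(e^{it}) e^{it} / (e^{it} - z) dt."""
    total = 0j
    for j in range(M):
        theta = -math.pi + 2 * math.pi * j / M
        zeta = cmath.exp(1j * theta)
        total += f(theta) * zeta / (zeta - z)
    return total / M

z = 0.3 - 0.4j               # a point in the unit disc (|z| = 0.5)
```

For $|z| < 1$ the two values `cauchy_integral(z)` and `hardy_projection(z)`
should agree up to rounding, since the equally spaced sum is very accurate
for smooth periodic integrands.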



5 Linear transformations 

The focus of analysis in Hilbert spaces is largely the study of their linear
transformations. We have already encountered two classes of such
transformations, the unitary mappings and the orthogonal projections.
There are two other important classes we shall deal with in this chapter
in some detail: the “linear functionals” and the “compact operators,”
and in particular those that are symmetric.

Suppose $\mathcal{H}_1$ and $\mathcal{H}_2$ are two Hilbert spaces. A mapping
$T : \mathcal{H}_1 \to \mathcal{H}_2$ is a linear transformation (also called
linear operator or operator) if

$$T(af + bg) = aT(f) + bT(g) \quad \text{for all scalars } a, b \text{ and } f, g \in \mathcal{H}_1.$$

Clearly, linear operators satisfy $T(0) = 0$.

We shall say that a linear operator $T : \mathcal{H}_1 \to \mathcal{H}_2$ is
bounded if there exists $M > 0$ so that

$$\|T(f)\|_{\mathcal{H}_2} \le M \|f\|_{\mathcal{H}_1}. \tag{6}$$

The norm of $T$ is denoted by $\|T\|_{\mathcal{H}_1 \to \mathcal{H}_2}$ or
simply $\|T\|$ and defined by

$$\|T\| = \inf M,$$

where the infimum is taken over all $M$ so that (6) holds. A trivial example
is given by the identity operator $I$, with $I(f) = f$. It is of course a
unitary operator and a projection, with $\|I\| = 1$.

In what follows we shall generally drop the subscripts attached to the
norms of elements of a Hilbert space, when this causes no confusion.

Lemma 5.1 $\|T\| = \sup\{|(Tf, g)| : \|f\| \le 1,\ \|g\| \le 1\}$, where of
course $f \in \mathcal{H}_1$ and $g \in \mathcal{H}_2$.

Proof. If $\|T\| \le M$, the Cauchy-Schwarz inequality gives

$$|(Tf, g)| \le M \quad \text{whenever } \|f\| \le 1 \text{ and } \|g\| \le 1;$$

thus $\sup\{|(Tf, g)| : \|f\| \le 1,\ \|g\| \le 1\} \le \|T\|$.

Conversely, if $\sup\{|(Tf, g)| : \|f\| \le 1,\ \|g\| \le 1\} \le M$, we claim
that $\|Tf\| \le M\|f\|$ for all $f$. If $f$ or $Tf$ is zero, there is nothing
to prove. Otherwise, $f' = f/\|f\|$ and $g' = Tf/\|Tf\|$ have norm 1, so by
assumption

$$|(Tf', g')| \le M.$$

But since $|(Tf', g')| = \|Tf\| / \|f\|$ this gives $\|Tf\| \le M\|f\|$, and
the lemma is proved.
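
Lemma 5.1 can be illustrated for a real $2 \times 2$ matrix acting on
$\mathbb{R}^2$. The sketch below is an assumption-laden check rather than a
general method: the matrix `T` is a hypothetical example, the suprema are
approximated on a grid of unit vectors, and the reference value
$\sqrt{3 + \sqrt{5}}$ is the square root of the largest eigenvalue of
$T^{t}T$ for this particular matrix.

```python
import math

T = [[2.0, 1.0], [0.0, 1.0]]     # hypothetical example matrix

def apply(T, f):
    return [sum(T[i][j] * f[j] for j in range(2)) for i in range(2)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(angle):
    return [math.cos(angle), math.sin(angle)]

angles = [2 * math.pi * k / 360 for k in range(360)]

# sup of |(Tf, g)| over unit vectors f, g on the grid
sup_bilinear = max(abs(dot(apply(T, unit(a)), unit(b)))
                   for a in angles for b in angles)

# sup of ||Tf|| over unit vectors f on the grid
sup_norm = max(math.sqrt(dot(apply(T, unit(a)), apply(T, unit(a))))
               for a in angles)

# T^t T = [[4, 2], [2, 2]] has largest eigenvalue 3 + sqrt(5)
exact = math.sqrt(3 + math.sqrt(5))
```

Both grid values approximate `exact` from below, which is what the lemma
predicts for the two ways of computing $\|T\|$.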

A linear transformation $T$ is continuous if $T(f_n) \to T(f)$ whenever
$f_n \to f$. Clearly, linearity implies that $T$ is continuous on all of
$\mathcal{H}_1$ if and only if it is continuous at the origin. In fact, the
conditions of being bounded or continuous are equivalent.


Proposition 5.2 A linear operator $T : \mathcal{H}_1 \to \mathcal{H}_2$ is
bounded if and only if it is continuous.

Proof. If $T$ is bounded, then
$\|T(f) - T(f_n)\|_{\mathcal{H}_2} \le M \|f - f_n\|_{\mathcal{H}_1}$,
hence $T$ is continuous. Conversely, suppose that $T$ is continuous but
not bounded. Then for each $n$ there exists $f_n \ne 0$ such that
$\|T(f_n)\| \ge n\|f_n\|$. The element $g_n = f_n / (n\|f_n\|)$ has norm
$1/n$, hence $g_n \to 0$. Since $T$ is continuous at 0, we must have
$T(g_n) \to 0$, which contradicts the fact that $\|T(g_n)\| \ge 1$. This
proves the proposition.

In the rest of this chapter we shall assume that all linear operators are
bounded, hence continuous. It is worth recalling that any linear operator
between finite-dimensional Hilbert spaces is necessarily continuous.

5.1 Linear functionals and the Riesz representation theorem 

A linear functional $\ell$ is a linear transformation from a Hilbert space
$\mathcal{H}$ to the underlying field of scalars, which we may assume to be
the complex numbers,

$$\ell : \mathcal{H} \to \mathbb{C}.$$

Of course, we view $\mathbb{C}$ as a Hilbert space equipped with its standard
norm, the absolute value.

A natural example of a linear functional is provided by the inner product
on $\mathcal{H}$. Indeed, for fixed $g \in \mathcal{H}$, the map

$$\ell(f) = (f, g)$$

is linear, and also bounded by the Cauchy-Schwarz inequality. Indeed,
$|(f, g)| \le M\|f\|$, where $M = \|g\|$.

Moreover, $\ell(g) = M\|g\|$ so we have $\|\ell\| = \|g\|$. The remarkable
fact is that this example is exhaustive, in the sense that every continuous
linear functional on a Hilbert space arises as an inner product. This is the
so-called Riesz representation theorem.

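Theorem 5.3 Let $\ell$ be a continuous linear functional on a Hilbert space
$\mathcal{H}$. Then, there exists a unique $g \in \mathcal{H}$ such that

$$\ell(f) = (f, g) \quad \text{for all } f \in \mathcal{H}.$$

Moreover, $\|\ell\| = \|g\|$.


Proof. Consider the subspace of $\mathcal{H}$ defined by

$$\mathcal{S} = \{f \in \mathcal{H} : \ell(f) = 0\}.$$

Since $\ell$ is continuous the subspace $\mathcal{S}$, which is called the
null-space of $\ell$, is closed. If $\mathcal{S} = \mathcal{H}$, then
$\ell = 0$ and we take $g = 0$. Otherwise $\mathcal{S}^{\perp}$ is non-trivial
and we may pick any $h \in \mathcal{S}^{\perp}$ with $\|h\| = 1$. With this
choice of $h$ we determine $g$ by setting $g = \overline{\ell(h)}\, h$. Thus
if we let $u = \ell(f) h - \ell(h) f$, then $u \in \mathcal{S}$, and therefore
$(u, h) = 0$. Hence

$$0 = (\ell(f) h - \ell(h) f,\, h) = \ell(f)(h, h) - (f, \overline{\ell(h)}\, h).$$

Since $(h, h) = 1$, we find that $\ell(f) = (f, g)$ as desired.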
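
The construction of $g$ in the proof can be followed step by step in
$\mathbb{C}^3$. This is a minimal sketch under our own assumptions: the
functional `ell` is defined by hypothetical coefficients `c`, and the unit
vector `h` orthogonal to the null-space is obtained from the conjugates of
those coefficients.

```python
def inner(f, g):
    """Inner product (f, g) = sum_j f_j * conj(g_j)."""
    return sum(a * b.conjugate() for a, b in zip(f, g))

c = [2 - 1j, 0.5j, -3 + 0j]      # hypothetical coefficients defining ell

def ell(f):
    """The linear functional ell(f) = sum_j c_j f_j."""
    return sum(cj * fj for cj, fj in zip(c, f))

c_norm = abs(inner(c, c)) ** 0.5

# Unit vector orthogonal to the null-space of ell:
# if ell(u) = 0 then (u, h) = ell(u) / ||c|| = 0.
h = [cj.conjugate() / c_norm for cj in c]

# g = conj(ell(h)) * h, as in the proof of Theorem 5.3
g = [ell(h).conjugate() * hj for hj in h]

f = [1 + 1j, -2 + 0j, 0.25 - 4j]  # arbitrary test vector
```

Here `ell(f)` and `inner(f, g)` should coincide, and `g` turns out to be the
vector of conjugated coefficients, as one would expect.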

At this stage we record the following remark for later use. Let
$\mathcal{H}_0$ be a pre-Hilbert space whose completion is $\mathcal{H}$.
Suppose $\ell_0$ is a linear functional on $\mathcal{H}_0$ which is bounded,
that is, $|\ell_0(f)| \le M\|f\|$ for all $f \in \mathcal{H}_0$. Then
$\ell_0$ has an extension $\ell$ to a bounded linear functional on
$\mathcal{H}$, with $|\ell(f)| \le M\|f\|$ for all $f \in \mathcal{H}$. This
extension is also unique. To see this, one merely notes that
$\{\ell_0(f_n)\}$ is a Cauchy sequence whenever the vectors $\{f_n\}$ belong
to $\mathcal{H}_0$, and $f_n \to f$ in $\mathcal{H}$, as $n \to \infty$. Thus
we may define $\ell(f)$ as $\lim_{n \to \infty} \ell_0(f_n)$. The verification
of the asserted properties of $\ell$ is then immediate. (This result is a
special case of the extension Lemma 1.3 in the next chapter.)

5.2 Adjoints 

The first application of the Riesz representation theorem is to determine 
the existence of the “adjoint” of a linear transformation. 

Proposition 5.4 Let $T : \mathcal{H} \to \mathcal{H}$ be a bounded linear
transformation. There exists a unique bounded linear transformation $T^*$ on
$\mathcal{H}$ so that:

(i) $(Tf, g) = (f, T^*g)$,

(ii) $\|T\| = \|T^*\|$,

(iii) $(T^*)^* = T$.

The linear operator $T^* : \mathcal{H} \to \mathcal{H}$ satisfying the above
conditions is called the adjoint of $T$.

To prove the existence of an operator satisfying (i) above, we observe
that for each fixed $g \in \mathcal{H}$, the linear functional
$\ell = \ell_g$, defined by

$$\ell(f) = (Tf, g),$$

is bounded. Indeed, since $T$ is bounded one has $\|Tf\| \le M\|f\|$; hence
the Cauchy-Schwarz inequality implies that

$$|\ell(f)| \le \|Tf\|\, \|g\| \le B\|f\|,$$

where $B = M\|g\|$. Consequently, the Riesz representation theorem
guarantees the existence of a unique $h \in \mathcal{H}$, $h = h_g$, such that

$$\ell(f) = (f, h).$$

Then we define $T^*g = h$, and note that the association
$T^* : g \mapsto h$ is linear and satisfies (i).

The fact that $\|T\| = \|T^*\|$ follows at once from (i) and Lemma 5.1:

$$\begin{aligned}
\|T\| &= \sup\{|(Tf, g)| : \|f\| \le 1,\ \|g\| \le 1\} \\
&= \sup\{|(f, T^*g)| : \|f\| \le 1,\ \|g\| \le 1\} \\
&= \|T^*\|.
\end{aligned}$$

To prove (iii), note that $(Tf, g) = (f, T^*g)$ for all $f$ and $g$ if and
only if $(T^*f, g) = (f, Tg)$ for all $f$ and $g$, as one can see by taking
complex conjugates and reversing the roles of $f$ and $g$.
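
In finite dimensions the adjoint is the conjugate transpose of the matrix,
and properties (i) and (iii) can be verified directly. The sketch below
assumes $\mathcal{H} = \mathbb{C}^2$ with
$(f, g) = \sum_j f_j \overline{g_j}$; the matrix `T` and the vectors are
hypothetical sample data.

```python
def inner(f, g):
    return sum(a * b.conjugate() for a, b in zip(f, g))

def apply(A, f):
    """Matrix-vector product A f."""
    return [sum(A[i][j] * f[j] for j in range(len(f))) for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose: entry (i, j) of A* is conj(A[j][i])."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

T = [[1 + 2j, 0.5 + 0j], [-1j, 3 - 1j]]   # hypothetical operator on C^2
f = [1 - 1j, 2 + 0.5j]
g = [-0.5 + 0j, 1 + 1j]

lhs = inner(apply(T, f), g)               # (Tf, g)
rhs = inner(f, apply(adjoint(T), g))      # (f, T*g)
```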

We record here a few additional remarks. 

(a) In the special case when $T = T^*$ (we say that $T$ is symmetric), then

$$\|T\| = \sup\{|(Tf, f)| : \|f\| = 1\}. \tag{7}$$

This should be compared to Lemma 5.1, which holds for any linear operator.
To establish (7), let $M = \sup\{|(Tf, f)| : \|f\| = 1\}$. By Lemma 5.1
it is clear that $M \le \|T\|$. Conversely, if $f$ and $g$ belong to
$\mathcal{H}$, then one has the following “polarization” identity, which is
easy to verify:

$$(Tf, g) = \frac{1}{4}\left[(T(f + g), f + g) - (T(f - g), f - g)
+ i\,(T(f + ig), f + ig) - i\,(T(f - ig), f - ig)\right].$$

For any $h \in \mathcal{H}$, the quantity $(Th, h)$ is real, because
$T = T^*$, hence $(Th, h) = (h, T^*h) = (h, Th) = \overline{(Th, h)}$.
Consequently

$$\mathrm{Re}(Tf, g) = \frac{1}{4}\left[(T(f + g), f + g) - (T(f - g), f - g)\right].$$

Now $|(Th, h)| \le M\|h\|^2$, so
$|\mathrm{Re}(Tf, g)| \le \frac{M}{4}\left[\|f + g\|^2 + \|f - g\|^2\right]$,
and an application of the parallelogram law (4) then implies

$$|\mathrm{Re}(Tf, g)| \le \frac{M}{2}\left[\|f\|^2 + \|g\|^2\right].$$

So if $\|f\| \le 1$ and $\|g\| \le 1$, then $|\mathrm{Re}(Tf, g)| \le M$. In
general, we may replace $g$ by $e^{i\theta} g$ in the last inequality to find
that whenever $\|f\| \le 1$ and $\|g\| \le 1$, then $|(Tf, g)| \le M$, and
invoking Lemma 5.1 once again gives the result, $\|T\| \le M$.

(b) Let us note that if $T$ and $S$ are bounded linear transformations of
$\mathcal{H}$ to itself, then so is their product $TS$, defined by
$(TS)(f) = T(S(f))$. Moreover we have automatically $(TS)^* = S^*T^*$; in
fact, $(TSf, g) = (Sf, T^*g) = (f, S^*T^*g)$.

(c) One can also exhibit a natural connection between linear transformations
on a Hilbert space and their associated bilinear forms. Suppose first
that $T$ is a bounded operator in $\mathcal{H}$. Define the corresponding
bilinear form $B$ by

$$B(f, g) = (Tf, g). \tag{8}$$

Note that $B$ is linear in $f$ and conjugate linear in $g$. Also by the
Cauchy-Schwarz inequality $|B(f, g)| \le M\|f\|\,\|g\|$, where $M = \|T\|$.
Conversely if $B$ is linear in $f$, conjugate linear in $g$ and satisfies
$|B(f, g)| \le M\|f\|\,\|g\|$, there is a unique linear transformation $T$ so
that (8) holds with $M = \|T\|$. This can be proved by the argument of
Proposition 5.4; the details are left to the reader.

5.3 Examples 

Having presented the elementary facts about Hilbert spaces, we now
digress to describe briefly the background of some of the early
developments of the theory. A motivating problem of considerable interest was
that of the study of the “eigenfunction expansion” of a differential
operator $L$. A particular case, that of a Sturm-Liouville operator, arises on
an interval $[a, b]$ of $\mathbb{R}$ with $L$ defined by

$$L = \frac{d^2}{dx^2} - q(x),$$

where $q$ is a given real-valued function. The question is then that of
expanding an “arbitrary” function in terms of the eigenfunctions $\varphi$,
that is those functions that satisfy $L(\varphi) = \mu\varphi$ for some
$\mu \in \mathbb{R}$. The classical example of this is that of Fourier series,
where $L = d^2/dx^2$ on the interval $[-\pi, \pi]$ with each exponential
$e^{inx}$ an eigenfunction of $L$ with eigenvalue $\mu = -n^2$.

When made precise in the “regular” case, the problem for $L$ can be
resolved by considering an associated “integral operator” $T$ defined on
$L^2([a, b])$ by

$$T(f)(x) = \int_a^b K(x, y) f(y)\, dy,$$

with the property that for suitable $f$,

$$LT(f) = f.$$

It turns out that a key feature that makes the study of $T$ tractable is
a certain compactness it enjoys. We now pass to the definitions and
elaboration of some of these ideas, and begin by giving two relevant
illustrations of classes of operators on Hilbert spaces.

Infinite diagonal matrix 

Suppose $\{\varphi_k\}_{k=1}^{\infty}$ is an orthonormal basis of
$\mathcal{H}$. Then, a linear transformation
$T : \mathcal{H} \to \mathcal{H}$ is said to be diagonalized with respect to
the basis $\{\varphi_k\}$ if

$$T(\varphi_k) = \lambda_k \varphi_k, \quad \text{where } \lambda_k \in \mathbb{C} \text{ for all } k.$$

In general, a non-zero element $\varphi$ is called an eigenvector of $T$ with
eigenvalue $\lambda$ if $T\varphi = \lambda\varphi$. So the $\varphi_k$ above
are eigenvectors of $T$, and the numbers $\lambda_k$ are the corresponding
eigenvalues.

So if

$$f \sim \sum_{k=1}^{\infty} a_k \varphi_k \quad \text{then} \quad Tf \sim \sum_{k=1}^{\infty} a_k \lambda_k \varphi_k.$$

The sequence $\{\lambda_k\}$ is called the multiplier sequence corresponding
to $T$.

In this case, one can easily verify the following facts:

• $\|T\| = \sup_k |\lambda_k|$.

• $T^*$ corresponds to the sequence $\{\overline{\lambda_k}\}$; hence
$T = T^*$ if and only if the $\lambda_k$ are real.

• $T$ is unitary if and only if $|\lambda_k| = 1$ for all $k$.

• $T$ is an orthogonal projection if and only if $\lambda_k = 0$ or $1$ for
all $k$.
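
The first of these facts can be verified in two lines. The following worked
computation is a sketch, assuming the multipliers are bounded and using
Parseval's identity $\|f\|^2 = \sum_k |a_k|^2$ for
$f \sim \sum_k a_k \varphi_k$:

```latex
\[
\|Tf\|^2 = \sum_{k=1}^{\infty} |\lambda_k|^2 |a_k|^2
\le \Big(\sup_j |\lambda_j|\Big)^2 \sum_{k=1}^{\infty} |a_k|^2
= \Big(\sup_j |\lambda_j|\Big)^2 \|f\|^2 ,
\]
% so ||T|| <= sup_j |lambda_j|. Conversely, testing T on a basis vector gives
\[
\|T\| \ge \|T\varphi_k\| = |\lambda_k| \quad \text{for every } k,
\qquad \text{hence} \qquad \|T\| = \sup_k |\lambda_k| .
\]
```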

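As a particular example, consider $\mathcal{H} = L^2([-\pi, \pi])$, and
assume that every $f \in L^2([-\pi, \pi])$ is extended to $\mathbb{R}$ by
periodicity, so that $f(x + 2\pi) = f(x)$ for all $x \in \mathbb{R}$. Let
$\varphi_k(x) = e^{ikx}$ for $k \in \mathbb{Z}$. For a fixed
$h \in \mathbb{R}$ the operator $U_h$ defined by

$$U_h(f)(x) = f(x + h)$$

is unitary with $\lambda_k = e^{ikh}$. Hence

$$U_h(f) \sim \sum_{k=-\infty}^{\infty} a_k \lambda_k e^{ikx} \quad \text{if} \quad f \sim \sum_{k=-\infty}^{\infty} a_k e^{ikx}.$$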
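
The multiplier $\lambda_k = e^{ikh}$ of the translation operator can be
observed numerically on a trigonometric polynomial. The sketch below makes
its own assumptions: the coefficients in `coeffs` and the shift `h` are
hypothetical, and the Fourier coefficients are computed by an equally spaced
sum, which is exact (up to rounding) for trigonometric polynomials of degree
below the number of sample points.

```python
import cmath
import math

coeffs = {-1: 1 - 2j, 0: 0.5, 3: 2j}      # hypothetical coefficients a_k

def f(x):
    return sum(a * cmath.exp(1j * k * x) for k, a in coeffs.items())

def fourier_coefficient(func, k, M=64):
    """Approximate (1/2 pi) * integral of func(x) e^{-ikx} over one period."""
    total = 0j
    for j in range(M):
        x = 2 * math.pi * j / M
        total += func(x) * cmath.exp(-1j * k * x)
    return total / M

h = 0.9                                    # illustrative shift

def shifted(x):
    """U_h(f)(x) = f(x + h)."""
    return f(x + h)

# k-th coefficient of U_h(f), to be compared with e^{ikh} a_k
shifted_coeffs = {k: fourier_coefficient(shifted, k) for k in coeffs}
```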

Integral operators, and in particular, Hilbert-Schmidt 
operators 

Let $\mathcal{H} = L^2(\mathbb{R}^d)$. If we can define an operator
$T : \mathcal{H} \to \mathcal{H}$ by the formula

$$T(f)(x) = \int_{\mathbb{R}^d} K(x, y) f(y)\, dy \quad \text{whenever } f \in L^2(\mathbb{R}^d),$$

we say that the operator $T$ is an integral operator and $K$ is its
associated kernel.

In fact, it was the problem of invertibility related to such operators,
and more precisely the question of solvability of the equation $f - Tf = g$
for given $g$, that initiated the study of Hilbert spaces. These equations
were then called “integral equations.”

In general a bounded linear transformation cannot be expressed as an
(absolutely convergent) integral operator. However, there is an interesting
class for which this is possible and which has a number of other
worthwhile properties: Hilbert-Schmidt operators, those with a kernel $K$
that belongs to $L^2(\mathbb{R}^d \times \mathbb{R}^d)$.

Proposition 5.5 Let $T$ be a Hilbert-Schmidt operator on
$L^2(\mathbb{R}^d)$ with kernel $K$.

(i) If $f \in L^2(\mathbb{R}^d)$, then for almost every $x$ the function
$y \mapsto K(x, y) f(y)$ is integrable.

(ii) The operator $T$ is bounded from $L^2(\mathbb{R}^d)$ to itself, and

$$\|T\| \le \|K\|_{L^2(\mathbb{R}^d \times \mathbb{R}^d)},$$

where $\|K\|_{L^2(\mathbb{R}^d \times \mathbb{R}^d)}$ is the $L^2$-norm of $K$
on $\mathbb{R}^d \times \mathbb{R}^d = \mathbb{R}^{2d}$.

(iii) The adjoint $T^*$ has kernel $\overline{K(y, x)}$.

Proof. By Fubini's theorem we know that for almost every $x$, the
function $y \mapsto |K(x, y)|^2$ is integrable. Then, part (i) follows
directly from an application of the Cauchy-Schwarz inequality.

For (ii), we make use again of the Cauchy-Schwarz inequality as follows:

$$\left| \int K(x, y) f(y)\, dy \right| \le \int |K(x, y)|\, |f(y)|\, dy
\le \left( \int |K(x, y)|^2\, dy \right)^{1/2} \left( \int |f(y)|^2\, dy \right)^{1/2}.$$

Therefore, squaring this and integrating in $x$ yields

$$\|Tf\|^2_{L^2(\mathbb{R}^d)} \le \int \left( \int |K(x, y)|^2\, dy \int |f(y)|^2\, dy \right) dx
= \|K\|^2_{L^2(\mathbb{R}^d \times \mathbb{R}^d)}\, \|f\|^2_{L^2(\mathbb{R}^d)}.$$

Finally, part (iii) follows by writing out $(Tf, g)$ in terms of a double
integral, and then interchanging the order of integration, as is permissible
by Fubini's theorem.

Hilbert-Schmidt operators can be defined analogously for the Hilbert
space $L^2(E)$, where $E$ is a measurable subset of $\mathbb{R}^d$. We leave
it to the reader to formulate and prove the analogue of Proposition 5.5 that
holds in this case.

Hilbert-Schmidt operators enjoy another important property: they are
compact. We will now discuss this feature in more detail.
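
A finite-dimensional analogue of the bound in Proposition 5.5 (ii) replaces
the integral by a finite sum, so that the kernel becomes a matrix and its
$L^2$-norm becomes the square root of the sum of the squared entries. The
sketch below, with a hypothetical matrix `K` and vector `f`, checks the
resulting inequality $\|Kf\| \le \|K\|_{HS}\,\|f\|$; it illustrates the
Cauchy-Schwarz step and is not a substitute for the measure-theoretic proof.

```python
import math

# Hypothetical "kernel" matrix and vector (discrete stand-ins for K and f).
K = [[1.0, -2.0, 0.5],
     [0.0, 3.0, 1.0],
     [2.0, 1.0, -1.0]]
f = [0.3, -1.2, 2.0]

def apply(K, f):
    """Discrete analogue of T(f)(x) = sum over y of K(x, y) f(y)."""
    return [sum(row[j] * f[j] for j in range(len(f))) for row in K]

def l2(v):
    return math.sqrt(sum(x * x for x in v))

hs_norm = math.sqrt(sum(x * x for row in K for x in row))  # discrete L^2 norm of K
lhs = l2(apply(K, f))          # ||T f||
rhs = hs_norm * l2(f)          # ||K|| * ||f||
```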

6 Compact operators 

We shall use the notion of sequential compactness in a Hilbert space
$\mathcal{H}$: a set $X \subset \mathcal{H}$ is compact if for every sequence
$\{f_n\}$ in $X$, there exists a subsequence $\{f_{n_k}\}$ that converges in
the norm to an element in $X$.

Let $\mathcal{H}$ denote a Hilbert space, and $B$ the closed unit ball in
$\mathcal{H}$,

$$B = \{f \in \mathcal{H} : \|f\| \le 1\}.$$

A well-known result in elementary real analysis says that in a
finite-dimensional Euclidean space, a closed and bounded set is compact.
However, this does not carry over to the infinite-dimensional case. The fact
is that in this case the unit ball, while closed and bounded, is not
compact. To see this, consider the sequence $\{f_n\} = \{e_n\}$, where the
$e_n$ are orthonormal. By the Pythagorean theorem,
$\|e_n - e_m\|^2 = 2$ if $n \ne m$, so no subsequence of the $\{e_n\}$ can
converge.
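
The distance computation behind this argument is easy to reproduce. The
sketch below works in a truncated coordinate space of dimension `dim` (an
assumption made so that vectors are finite lists) and confirms that distinct
coordinate vectors are all at distance $\sqrt{2}$ from one another, so no
subsequence of them can be Cauchy.

```python
import math

def e(n, dim):
    """n-th standard basis vector in a dim-dimensional coordinate space."""
    return [1.0 if j == n else 0.0 for j in range(dim)]

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

dim = 6   # illustrative truncation of an infinite orthonormal sequence
distances = {(n, m): dist(e(n, dim), e(m, dim))
             for n in range(dim) for m in range(n + 1, dim)}
```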

In the infinite-dimensional case we say that a linear operator
$T : \mathcal{H} \to \mathcal{H}$ is compact if the closure of

$$T(B) = \{g \in \mathcal{H} : g = T(f) \text{ for some } f \in B\}$$

is a compact set. Equivalently, an operator $T$ is compact if, whenever
$\{f_k\}$ is a bounded sequence in $\mathcal{H}$, there exists a subsequence
$\{f_{n_k}\}$ so that $Tf_{n_k}$ converges. Note that a compact operator is
automatically bounded.

Note that by what has been said, a linear transformation is in general
not compact (take for instance the identity operator!). However, if $T$ is
of finite rank, which means that its range is finite-dimensional, then
it is automatically compact. It turns out that dealing with compact
operators provides us with the closest analogy to the usual theorems of
(finite-dimensional) linear algebra. Some relevant analytic properties of
compact operators are given by the proposition below.

Proposition 6.1 Suppose $T$ is a bounded linear operator on $\mathcal{H}$.

(i) If $S$ is compact on $\mathcal{H}$, then $ST$ and $TS$ are also compact.

(ii) If $\{T_n\}$ is a family of compact linear operators with
$\|T_n - T\| \to 0$ as $n$ tends to infinity, then $T$ is compact.

(iii) Conversely, if $T$ is compact, there is a sequence $\{T_n\}$ of
operators of finite rank such that $\|T_n - T\| \to 0$.

(iv) $T$ is compact if and only if $T^*$ is compact.

Proof. Part (i) is immediate. For part (ii) we use a diagonalization
argument. Suppose $\{f_k\}$ is a bounded sequence in $\mathcal{H}$. Since
$T_1$ is compact, we may extract a subsequence $\{f_{1,k}\}_{k=1}^{\infty}$ of
$\{f_k\}$ such that $T_1(f_{1,k})$ converges. From $\{f_{1,k}\}$ we may find a
subsequence $\{f_{2,k}\}_{k=1}^{\infty}$ such that $T_2(f_{2,k})$ converges,
and so on. If we let $g_k = f_{k,k}$, then we claim $\{T(g_k)\}$ is a Cauchy
sequence. We have

$$\|T(g_k) - T(g_\ell)\| \le \|T(g_k) - T_m(g_k)\| + \|T_m(g_k) - T_m(g_\ell)\| + \|T_m(g_\ell) - T(g_\ell)\|.$$

Since $\|T - T_m\| \to 0$ and $\{g_k\}$ is bounded, we can make the first and
last term each $< \epsilon/3$ for some large $m$ independent of $k$ and
$\ell$. With this fixed $m$, we note that by construction
$\|T_m(g_k) - T_m(g_\ell)\| < \epsilon/3$ for all large $k$ and $\ell$. This
proves our claim; hence $\{T(g_k)\}$ converges in $\mathcal{H}$.

To prove (iii) let $\{e_k\}_{k=1}^{\infty}$ be a basis of $\mathcal{H}$ and
let $Q_n$ be the orthogonal projection on the subspace spanned by the $e_k$
with $k > n$. Then clearly $Q_n(g) \sim \sum_{k > n} a_k e_k$ whenever
$g \sim \sum_{k=1}^{\infty} a_k e_k$, and $\|Q_n g\|^2$ is a decreasing
sequence that tends to 0 as $n \to \infty$ for any $g \in \mathcal{H}$. We
claim that $\|Q_n T\| \to 0$ as $n \to \infty$. If not, there is a $c > 0$ so
that $\|Q_n T\| \ge c$, and hence for each $n$ we can find $f_n$, with
$\|f_n\| = 1$ so that $\|Q_n T f_n\| \ge c$. Now by compactness of $T$,
choosing an appropriate subsequence $\{f_{n_k}\}$, we have
$Tf_{n_k} \to g$ for some $g$. But
$Q_{n_k}(g) = Q_{n_k} T f_{n_k} - Q_{n_k}(T f_{n_k} - g)$, and hence we
conclude that $\|Q_{n_k}(g)\| \ge c/2$, for large $k$. This contradiction
shows that $\|Q_n T\| \to 0$. So if $P_n$ is the complementary projection
on the finite-dimensional space spanned by $e_1, \ldots, e_n$,
$I = P_n + Q_n$, then $\|Q_n T\| \to 0$ means that $\|P_n T - T\| \to 0$.
Since each $P_n T$ is of finite rank, assertion (iii) is established.

Finally, if $T$ is compact the fact that $\|P_n T - T\| \to 0$ implies
$\|T^* P_n - T^*\| \to 0$, and clearly $T^* P_n$ is again of finite rank. Thus
we need only appeal to the second conclusion to prove the last.


We now state two further observations about compact operators.

• If $T$ can be diagonalized with respect to some basis $\{\varphi_k\}$ of
eigenvectors and corresponding eigenvalues $\{\lambda_k\}$, then $T$ is
compact if and only if $|\lambda_k| \to 0$. See Exercise 25.

• Every Hilbert-Schmidt operator is compact.

To prove the second point, recall that a Hilbert-Schmidt operator is
given on $L^2(\mathbb{R}^d)$ by

$$T(f)(x) = \int_{\mathbb{R}^d} K(x, y) f(y)\, dy, \quad \text{where } K \in L^2(\mathbb{R}^d \times \mathbb{R}^d).$$

If $\{\varphi_k\}_{k=1}^{\infty}$ denotes an orthonormal basis for
$L^2(\mathbb{R}^d)$, then the collection
$\{\varphi_k(x)\varphi_\ell(y)\}_{k,\ell \ge 1}$ is an orthonormal basis for
$L^2(\mathbb{R}^d \times \mathbb{R}^d)$; the proof of this simple fact is
outlined in Exercise 7. As a result

$$K(x, y) \sim \sum_{k,\ell=1}^{\infty} a_{k\ell}\, \varphi_k(x) \varphi_\ell(y), \quad \text{with } \sum_{k,\ell} |a_{k\ell}|^2 < \infty.$$

We define an operator

$$T_n f(x) = \int_{\mathbb{R}^d} K_n(x, y) f(y)\, dy, \quad \text{where } K_n(x, y) = \sum_{k,\ell=1}^{n} a_{k\ell}\, \varphi_k(x) \varphi_\ell(y).$$

Then, each $T_n$ has finite-dimensional range, hence is compact. Moreover,

$$\|K - K_n\|^2_{L^2(\mathbb{R}^d \times \mathbb{R}^d)} = \sum_{k > n \text{ or } \ell > n} |a_{k\ell}|^2 \to 0 \quad \text{as } n \to \infty.$$

By Proposition 5.5, $\|T - T_n\| \le \|K - K_n\|_{L^2(\mathbb{R}^d \times \mathbb{R}^d)}$,
so we can conclude the proof that $T$ is compact by appealing to
Proposition 6.1.
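
The approximation by finite-rank operators is easiest to see for a diagonal
operator. The sketch below assumes the hypothetical multiplier sequence
$\lambda_k = 1/k$ (truncated to finitely many terms so that it fits in a
list) and uses the fact that a diagonal operator has norm
$\sup_k |\lambda_k|$; the truncation $T_n$ keeps the first $n$ multipliers
and sets the rest to zero.

```python
def multipliers(K):
    """First K terms of the illustrative multiplier sequence lambda_k = 1/k."""
    return [1.0 / k for k in range(1, K + 1)]

def truncation_error(lams, n):
    """Norm of T - T_n for diagonal T: the largest |lambda_k| with k > n."""
    tail = lams[n:]
    return max((abs(l) for l in tail), default=0.0)

lams = multipliers(1000)
errors = [truncation_error(lams, n) for n in (1, 10, 100)]
```

In this example the errors are $1/(n+1)$, which tend to 0, consistent with
the criterion $|\lambda_k| \to 0$ for compactness of a diagonal operator.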

The climax of our efforts regarding compact operators is the
infinite-dimensional version of the familiar diagonalization theorem in linear
algebra for symmetric matrices. Using a similar terminology, we say that
a bounded linear operator $T$ is symmetric if $T^* = T$. (These operators
are also called “self-adjoint” or “Hermitian.”)

Theorem 6.2 (Spectral theorem) Suppose $T$ is a compact symmetric operator
on a Hilbert space $\mathcal{H}$. Then there exists an (orthonormal)
basis $\{\varphi_k\}_{k=1}^{\infty}$ of $\mathcal{H}$ that consists of
eigenvectors of $T$. Moreover, if

$$T\varphi_k = \lambda_k \varphi_k,$$

then $\lambda_k \in \mathbb{R}$ and $\lambda_k \to 0$ as $k \to \infty$.

Conversely, every operator of the above form is compact and symmetric.
The collection $\{\lambda_k\}$ is called the spectrum of $T$.


Lemma 6.3 Suppose $T$ is a bounded symmetric linear operator on a
Hilbert space $\mathcal{H}$.

(i) If $\lambda$ is an eigenvalue of $T$, then $\lambda$ is real.

(ii) If $f_1$ and $f_2$ are eigenvectors corresponding to two distinct
eigenvalues, then $f_1$ and $f_2$ are orthogonal.

Proof. To prove (i), we first choose a non-zero eigenvector $f$ such
that $T(f) = \lambda f$. Since $T$ is symmetric (that is, $T = T^*$), we find
that

$$\lambda (f, f) = (Tf, f) = (f, Tf) = (f, \lambda f) = \overline{\lambda}(f, f),$$

where we have used in the last equality the fact that the inner product is
conjugate linear in the second variable. Since $f \ne 0$, we must have
$\lambda = \overline{\lambda}$ and hence $\lambda \in \mathbb{R}$.

For (ii), suppose $f_1$ and $f_2$ have eigenvalues $\lambda_1$ and
$\lambda_2$, respectively. By the previous argument both $\lambda_1$ and
$\lambda_2$ are real, and we note that

$$\begin{aligned}
\lambda_1 (f_1, f_2) &= (\lambda_1 f_1, f_2) \\
&= (Tf_1, f_2) \\
&= (f_1, Tf_2) \\
&= (f_1, \lambda_2 f_2) \\
&= \lambda_2 (f_1, f_2).
\end{aligned}$$

Since by assumption $\lambda_1 \ne \lambda_2$ we must have $(f_1, f_2) = 0$
as desired.
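
Both conclusions of the lemma can be observed on a small Hermitian matrix.
The following sketch uses a hypothetical $2 \times 2$ example on
$\mathbb{C}^2$, computes the eigenvalues from the characteristic polynomial,
and builds eigenvectors from the first row of $T - \lambda I$; the helper
names are our own.

```python
import cmath

def inner(f, g):
    return sum(a * b.conjugate() for a, b in zip(f, g))

def apply(A, f):
    return [sum(A[i][j] * f[j] for j in range(2)) for i in range(2)]

# Hypothetical symmetric (Hermitian) matrix: T equals its conjugate transpose.
T = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]

trace = T[0][0] + T[1][1]
det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
disc = cmath.sqrt(trace * trace - 4 * det)
eigenvalues = [(trace + disc) / 2, (trace - disc) / 2]

# For each eigenvalue lam, v = (T01, lam - T00) satisfies T v = lam v
# (valid here because T01 is non-zero).
eigenvectors = [[T[0][1], lam - T[0][0]] for lam in eigenvalues]
```

For this matrix the eigenvalues come out as 4 and 1, both real, and the two
eigenvectors have inner product zero.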

For the next lemma note that every non-zero element of the null-space
of $T - \lambda I$ is an eigenvector with eigenvalue $\lambda$.

Lemma 6.4 Suppose $T$ is compact, and $\lambda \ne 0$. Then the dimension of
the null space of $T - \lambda I$ is finite. Moreover, the eigenvalues of $T$
form at most a denumerable set $\lambda_1, \ldots, \lambda_k, \ldots$, with
$\lambda_k \to 0$ as $k \to \infty$. More specifically, for each $\mu > 0$,
the linear space spanned by the eigenvectors corresponding to the eigenvalues
$\lambda_k$ with $|\lambda_k| > \mu$ is finite-dimensional.

Proof. Let $V_\lambda$ denote the null-space of $T - \lambda I$, that is, the
eigenspace of $T$ corresponding to $\lambda$. If $V_\lambda$ is not
finite-dimensional, there exists a countable sequence of orthonormal vectors
$\{\varphi_k\}$ in $V_\lambda$. Since $T$ is compact, there exists a
subsequence $\{\varphi_{n_k}\}$ such that $T(\varphi_{n_k})$ converges.
But since $T(\varphi_{n_k}) = \lambda \varphi_{n_k}$ and $\lambda \ne 0$, we
conclude that $\varphi_{n_k}$ converges, which is a contradiction since
$\|\varphi_{n_k} - \varphi_{n_{k'}}\|^2 = 2$ if $k \ne k'$.

The rest of the lemma follows if we can show that for each $\mu > 0$, there
are only finitely many eigenvalues whose absolute values are greater than
$\mu$. We argue again by contradiction. Suppose there are infinitely many
distinct eigenvalues whose absolute values are greater than $\mu$, and let
$\{\varphi_k\}$ be a corresponding sequence of eigenvectors. Since the
eigenvalues are distinct, we know from the previous lemma that
$\{\varphi_k\}$ is orthogonal, and after normalization, we may assume that
this set of eigenvectors is orthonormal. Once again, since $T$ is compact, we
may find a subsequence so that $T(\varphi_{n_k})$ converges, and since

$$T(\varphi_{n_k}) = \lambda_{n_k} \varphi_{n_k},$$

the fact that $|\lambda_{n_k}| > \mu$ leads to a contradiction, since
$\{\varphi_k\}$ is an orthonormal set and thus
$\|\lambda_{n_k}\varphi_{n_k} - \lambda_{n_j}\varphi_{n_j}\|^2
= \lambda_{n_k}^2 + \lambda_{n_j}^2 \ge 2\mu^2$.

Lemma 6.5 Suppose $T \ne 0$ is compact and symmetric. Then either $\|T\|$
or $-\|T\|$ is an eigenvalue of $T$.

Proof. By the observation (7) made earlier, either

$$\|T\| = \sup\{(Tf, f) : \|f\| = 1\} \quad \text{or} \quad -\|T\| = \inf\{(Tf, f) : \|f\| = 1\}.$$

We assume the first case, that is,

$$\lambda = \|T\| = \sup\{(Tf, f) : \|f\| = 1\},$$

and prove that $\lambda$ is an eigenvalue of $T$. (The proof of the other
case is similar.)

We pick a sequence $\{f_n\} \subset \mathcal{H}$ such that $\|f_n\| = 1$ and
$(Tf_n, f_n) \to \lambda$. Since $T$ is compact, we may assume also (by
passing to a subsequence of $\{f_n\}$ if necessary) that $\{Tf_n\}$ converges
to a limit $g \in \mathcal{H}$. We claim that $g$ is an eigenvector of $T$
with eigenvalue $\lambda$. To see this, we first observe that
$Tf_n - \lambda f_n \to 0$ because

$$\begin{aligned}
\|Tf_n - \lambda f_n\|^2 &= \|Tf_n\|^2 - 2\lambda (Tf_n, f_n) + \lambda^2 \|f_n\|^2 \\
&\le \|T\|^2 \|f_n\|^2 - 2\lambda (Tf_n, f_n) + \lambda^2 \|f_n\|^2 \\
&\le 2\lambda^2 - 2\lambda (Tf_n, f_n) \to 0.
\end{aligned}$$

Since $Tf_n \to g$, we must have $\lambda f_n \to g$, and since $T$ is
continuous, this implies that $\lambda Tf_n \to Tg$. This proves that
$\lambda g = Tg$. Finally, we must have $g \ne 0$, for otherwise
$\|Tf_n\| \to 0$, hence $(Tf_n, f_n) \to 0$, and $\lambda = \|T\| = 0$, which
is a contradiction.
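
The mechanism of the proof, namely maximizing $(Tf, f)$ over unit vectors and
observing that a maximizer is an eigenvector, can be mimicked in
$\mathbb{R}^2$. The sketch below uses a hypothetical symmetric matrix with
eigenvalues 3 and 1 and a simple grid search over the unit circle in place of
the compactness argument.

```python
import math

T = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical symmetric matrix, eigenvalues 3 and 1

def apply(T, f):
    return [T[0][0] * f[0] + T[0][1] * f[1],
            T[1][0] * f[0] + T[1][1] * f[1]]

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def rayleigh(angle):
    """(Tf, f) for the unit vector f = (cos(angle), sin(angle))."""
    f = [math.cos(angle), math.sin(angle)]
    return dot(apply(T, f), f)

angles = [2 * math.pi * k / 360 for k in range(360)]
best = max(angles, key=rayleigh)          # grid maximizer of (Tf, f)
lam = rayleigh(best)                      # approximates sup (Tf, f) = ||T||
f = [math.cos(best), math.sin(best)]
residual = [a - lam * b for a, b in zip(apply(T, f), f)]   # Tf - lam f
```

Here `lam` should be close to 3 and `residual` close to zero, so the
maximizing unit vector behaves as an eigenvector with eigenvalue $\|T\|$.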

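We are now equipped with the necessary tools to prove the spectral
theorem. Let $\mathcal{S}$ denote the closure of the linear space spanned by
all eigenvectors of $T$. By Lemma 6.5, the space $\mathcal{S}$ is non-trivial
(if $T = 0$, every non-zero vector is an eigenvector). The goal is to prove
that $\mathcal{S} = \mathcal{H}$. If not, then since

$$\mathcal{S} \oplus \mathcal{S}^{\perp} = \mathcal{H}, \tag{9}$$

$\mathcal{S}^{\perp}$ would be non-trivial, that is, it would contain a
non-zero vector. We will have reached a contradiction once we show that
$\mathcal{S}^{\perp}$ contains an eigenvector of $T$. First, we note that $T$
respects the decomposition (9). In other words, if $f \in \mathcal{S}$ then
$Tf \in \mathcal{S}$, which follows from the definitions. Also, if
$g \in \mathcal{S}^{\perp}$ then $Tg \in \mathcal{S}^{\perp}$. This is because
$T$ is symmetric and maps $\mathcal{S}$ to itself, and hence

$$(Tg, f) = (g, Tf) = 0 \quad \text{whenever } g \in \mathcal{S}^{\perp} \text{ and } f \in \mathcal{S}.$$

Now consider the operator $T_1$, which by definition is the restriction of
$T$ to the subspace $\mathcal{S}^{\perp}$. The closed subspace
$\mathcal{S}^{\perp}$ inherits its Hilbert space structure from
$\mathcal{H}$. We see immediately that $T_1$ is also a compact and
symmetric operator on this Hilbert space. Moreover, if $\mathcal{S}^{\perp}$
is non-trivial, then $T_1$ has a non-zero eigenvector in
$\mathcal{S}^{\perp}$: if $T_1 = 0$, any non-zero vector of
$\mathcal{S}^{\perp}$ is an eigenvector with eigenvalue 0, and otherwise
Lemma 6.5 applies. This eigenvector is clearly also an eigenvector of $T$, and
therefore a contradiction is obtained. This concludes the proof of the
spectral theorem.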
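
One way to read the theorem is as an explicit formula for $T$. The following
is a short sketch of that consequence, assuming $\{\varphi_k\}$ and
$\{\lambda_k\}$ are as in Theorem 6.2 and using only the continuity of $T$
and the expansion of $f$ in the orthonormal basis:

```latex
\[
f = \sum_{k=1}^{\infty} (f, \varphi_k)\, \varphi_k
\quad \Longrightarrow \quad
Tf = \sum_{k=1}^{\infty} \lambda_k\, (f, \varphi_k)\, \varphi_k ,
\]
% and, taking the inner product with f,
\[
(Tf, f) = \sum_{k=1}^{\infty} \lambda_k\, |(f, \varphi_k)|^2 ,
\qquad
\|Tf\|^2 = \sum_{k=1}^{\infty} \lambda_k^2\, |(f, \varphi_k)|^2 .
\]
```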

Some comments about Theorem 6.2 are in order. If in its statement we
drop either of the two assumptions (the compactness or symmetry of $T$),
then $T$ may have no eigenvectors. (See Exercises 32 and 33.) However,
when $T$ is a general bounded linear transformation which is symmetric,
there is an appropriate extension of the spectral theorem that holds for
it. Its formulation and proof require further ideas that are deferred to
Chapter 6.

7 Exercises 

1. Show that properties (i) and (ii) in the definition of a Hilbert space
(Section 2) imply property (iii): the Cauchy-Schwarz inequality
$|(f, g)| \le \|f\| \cdot \|g\|$ and the triangle inequality
$\|f + g\| \le \|f\| + \|g\|$.

[Hint: For the first inequality, consider $(f + \lambda g, f + \lambda g)$ as
a positive quadratic function of $\lambda$. For the second, write
$\|f + g\|^2$ as $(f + g, f + g)$.]


2. In the case of equality in the Cauchy-Schwarz inequality we have the
following. If $|(f, g)| = \|f\|\, \|g\|$ and $g \ne 0$, then $f = cg$ for some
scalar $c$.

[Hint: Assume $\|f\| = \|g\| = 1$ and $(f, g) = 1$. Then $f - g$ and $g$ are
orthogonal, while $f = f - g + g$. Thus
$\|f\|^2 = \|f - g\|^2 + \|g\|^2$.]

3. Note that $\|f + g\|^2 = \|f\|^2 + \|g\|^2 + 2\,\mathrm{Re}(f, g)$ for any
pair of elements in a Hilbert space $\mathcal{H}$. As a result, verify the
identity $\|f + g\|^2 + \|f - g\|^2 = 2(\|f\|^2 + \|g\|^2)$.

4. Prove from the definition that £ 2 (Z) is complete and separable. 

5. Establish the following relations between L 2 (R d ) and L 1 (R d ): 

(a) Neither the inclusion L 2 (R d ) C L 1 (R d ) nor the inclusion L 1 (R d ) C L 2 {R d ) 


is valid. 


(b) Note, however, that if / is supported on a set E of finite measure and if / G 
L 2 (R d ), applying the Cauchy-Schwarz inequality to f\E gives / G L 1 (R d ), 
and 


ll/IL 1 ^) $ m ( 五 ) 1 / 2 ||/IL 2 (R d ). 

(c) If / is bounded (|/(x)| < M), and / G then / G L 2 (R d ) with 


WfWm^) < m 1/2 ||/||^ 2 


L 1 ^)' 


[Hint: For (a) consider f(x) = |x| _a , when \x\ < 1 or when |a:| > 1.] 

6. Prove that the following are dense subspaces of L 2 (R d ). 

(a) The simple functions. 

(b) The continuous functions of compact support. 

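7. Suppose $\{\varphi_k\}_{k=1}^{\infty}$ is an orthonormal basis for $L^2(\mathbb{R}^d)$. Prove that the collection 
$\{\varphi_{k,j}\}_{1 \le k,j < \infty}$ with $\varphi_{k,j}(x, y) = \varphi_k(x)\varphi_j(y)$ is an orthonormal basis of 
$L^2(\mathbb{R}^d \times \mathbb{R}^d)$. 

[Hint: First verify that the $\{\varphi_{k,j}\}$ are orthonormal, by Fubini's theorem. Next, 
for each $j$ consider $F_j(x) = \int_{\mathbb{R}^d} F(x, y)\overline{\varphi_j(y)}\,dy$. If one assumes that 
$(F, \varphi_{k,j}) = 0$ for all $j$, then $\int F_j(x)\overline{\varphi_k(x)}\,dx = 0$.] 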

8. Let $\eta(t)$ be a fixed strictly positive continuous function on $[a, b]$. Define 
$\mathcal{H}_\eta = L^2([a, b], \eta)$ to be the space of all measurable functions $f$ on $[a, b]$ such that 

$\int_a^b |f(t)|^2 \eta(t)\,dt < \infty$. 

Define the inner product on $\mathcal{H}_\eta$ by 

$(f, g)_\eta = \int_a^b f(t)\overline{g(t)}\,\eta(t)\,dt$. 

(a) Show that $\mathcal{H}_\eta$ is a Hilbert space, and that the mapping $U : f \mapsto \eta^{1/2} f$ gives 
a unitary correspondence between $\mathcal{H}_\eta$ and the usual space $L^2([a, b])$. 

(b) Generalize this to the case when $\eta$ is not necessarily continuous. 

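9. Let $\mathcal{H}_1 = L^2([-\pi, \pi])$ be the Hilbert space of functions $F(e^{i\theta})$ on the unit circle 
with inner product $(F, G) = \frac{1}{2\pi}\int_{-\pi}^{\pi} F(e^{i\theta})\overline{G(e^{i\theta})}\,d\theta$. Let $\mathcal{H}_2$ be the space $L^2(\mathbb{R})$. 
Using the mapping 

$x \mapsto \frac{i - x}{i + x}$ 

of $\mathbb{R}$ to the unit circle, show that: 

(a) The correspondence $U : F \to f$, with 

$f(x) = \frac{1}{\pi^{1/2}(i + x)}\, F\!\left(\frac{i - x}{i + x}\right)$ 

gives a unitary mapping of $\mathcal{H}_1$ to $\mathcal{H}_2$. 

(b) As a result, 

$\left\{ \frac{1}{\pi^{1/2}} \left(\frac{i - x}{i + x}\right)^{n} \frac{1}{i + x} \right\}_{n=-\infty}^{\infty}$ 

is an orthonormal basis of $L^2(\mathbb{R})$. 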


10. Let $S$ denote a subspace of a Hilbert space $\mathcal{H}$. Prove that $(S^\perp)^\perp$ is the 
smallest closed subspace of $\mathcal{H}$ that contains $S$. 

11. Let $P$ be the orthogonal projection associated with a closed subspace $S$ in a 
Hilbert space $\mathcal{H}$, that is, 

$P(f) = f$ if $f \in S$ and $P(f) = 0$ if $f \in S^\perp$. 

(a) Show that $P^2 = P$ and $P^* = P$. 

(b) Conversely, if $P$ is any bounded operator satisfying $P^2 = P$ and $P^* = P$, 
prove that $P$ is the orthogonal projection for some closed subspace of $\mathcal{H}$. 

(c) Using $P$, prove that if $S$ is a closed subspace of a separable Hilbert space, 
then $S$ is also a separable Hilbert space. 


12. Let $E$ be a measurable subset of $\mathbb{R}^d$, and suppose $S$ is the subspace of $L^2(\mathbb{R}^d)$ 
of functions that vanish for a.e. $x \notin E$. Show that the orthogonal projection $P$ on 
$S$ is given by $P(f) = \chi_E \cdot f$, where $\chi_E$ is the characteristic function of $E$. 








196 


Chapter 4. HILBERT SPACES: AN INTRODUCTION 


13. Suppose $P_1$ and $P_2$ are a pair of orthogonal projections on $S_1$ and $S_2$, respectively. 
Then $P_1 P_2$ is an orthogonal projection if and only if $P_1$ and $P_2$ commute, 
that is, $P_1 P_2 = P_2 P_1$. In this case, $P_1 P_2$ projects onto $S_1 \cap S_2$. 


14. Suppose $\mathcal{H}$ and $\mathcal{H}'$ are two completions of a pre-Hilbert space $\mathcal{H}_0$. Show that 
there is a unitary mapping from $\mathcal{H}$ to $\mathcal{H}'$ that is the identity on $\mathcal{H}_0$. 


[Hint: If $f \in \mathcal{H}$, pick a Cauchy sequence $\{f_n\}$ in $\mathcal{H}_0$ that converges to $f$ in $\mathcal{H}$. This 
sequence will also converge to an element $f'$ in $\mathcal{H}'$. The mapping $f \mapsto f'$ gives the 
required unitary mapping.] 


15. Let $T$ be any linear transformation from $\mathcal{H}_1$ to $\mathcal{H}_2$. If we suppose that $\mathcal{H}_1$ is 
finite-dimensional, then $T$ is automatically bounded. (If $\mathcal{H}_1$ is not assumed to be 
finite-dimensional this may fail; see Problem 1 below.) 

16. Let $F_0(z) = 1/(1 - z)^i$. 

(a) Verify that $|F_0(z)| \le e^{\pi/2}$ in the unit disc, but that $\lim_{r \to 1} F_0(r)$ does not 
exist. 

[Hint: Note that $|F_0(r)| = 1$ and $F_0(r)$ oscillates between $\pm 1$ infinitely often 
as $r \to 1$.] 

(b) Let $\{\alpha_n\}_{n=1}^{\infty}$ be an enumeration of the rationals, and let 

$F(z) = \sum_{j=1}^{\infty} \delta^j F_0(z e^{-i\alpha_j})$, 

where $\delta$ is sufficiently small. Show that $\lim_{r \to 1} F(re^{i\theta})$ fails to exist whenever 
$\theta = \alpha_j$, and hence $F$ fails to have a radial limit for a dense set of points 
on the unit circle. 


17. Fatou's theorem can be generalized by allowing a point to approach the 
boundary in larger regions, as follows. 

For each $0 < s < 1$ and point $z$ on the unit circle, consider the region $\Gamma_s(z)$ 
defined as the smallest closed convex set that contains $z$ and the closed disc $D_s(0)$. 
In other words, $\Gamma_s(z)$ consists of all lines joining $z$ with points in $D_s(0)$. Near the 
point $z$, the region $\Gamma_s(z)$ looks like a triangle. See Figure 2. 

We say that a function $F$ defined in the open unit disc has a non-tangential 
limit at a point $z$ on the circle, if for every $0 < s < 1$, the limit 

$\lim_{w \to z,\ w \in \Gamma_s(z)} F(w)$ 

exists. 

Prove that if $F$ is holomorphic and bounded on the open unit disc, then $F$ has 
a non-tangential limit for almost every point on the unit circle. 

[Hint: Show that the Poisson integral of a function $f$ has non-tangential limits at 
every point of the Lebesgue set of $f$.] 
18. Let $\mathcal{H}$ denote a Hilbert space, and $\mathcal{L}(\mathcal{H})$ the vector space of all bounded linear 
operators on $\mathcal{H}$. Given $T \in \mathcal{L}(\mathcal{H})$, we define the operator norm 

$\|T\| = \inf\{B : \|Tv\| \le B\|v\|, \text{ for all } v \in \mathcal{H}\}$. 

(a) Show that $\|T_1 + T_2\| \le \|T_1\| + \|T_2\|$ whenever $T_1, T_2 \in \mathcal{L}(\mathcal{H})$. 

(b) Prove that 

$d(T_1, T_2) = \|T_1 - T_2\|$ 

defines a metric on $\mathcal{L}(\mathcal{H})$. 

(c) Show that $\mathcal{L}(\mathcal{H})$ is complete in the metric $d$. 

19. If $T$ is a bounded linear operator on a Hilbert space, prove that 

$\|TT^*\| = \|T^*T\| = \|T\|^2 = \|T^*\|^2$. 


20. Suppose $\mathcal{H}$ is an infinite-dimensional Hilbert space. We have seen an example 
of a sequence $\{f_n\}$ in $\mathcal{H}$ with $\|f_n\| = 1$ for all $n$, but for which no subsequence 
of $\{f_n\}$ converges in $\mathcal{H}$. However, show that for any sequence $\{f_n\}$ in $\mathcal{H}$ with 
$\|f_n\| = 1$ for all $n$, there exist $f \in \mathcal{H}$ and a subsequence $\{f_{n_k}\}$ such that for all 
$g \in \mathcal{H}$, one has 

$\lim_{k \to \infty} (f_{n_k}, g) = (f, g)$. 

One says that $\{f_{n_k}\}$ converges weakly to $f$. 

[Hint: Let $g$ run through a basis for $\mathcal{H}$, and use a diagonalization argument. One 
can then define $f$ by giving its series expansion with respect to the chosen basis.] 
21. There are several senses in which a sequence of bounded operators $\{T_n\}$ can 
converge to a bounded operator $T$ (in a Hilbert space $\mathcal{H}$). First, there is 
convergence in the norm, that is, $\|T_n - T\| \to 0$, as $n \to \infty$. Next, there is a weaker 
convergence, which happens to be called strong convergence, that requires that 
$T_n f \to Tf$, as $n \to \infty$, for every vector $f \in \mathcal{H}$. Finally, there is weak 
convergence (see also Exercise 20) that requires $(T_n f, g) \to (Tf, g)$ for every pair of 
vectors $f, g \in \mathcal{H}$. 

(a) Show by examples that weak convergence does not imply strong convergence, 
nor does strong convergence imply convergence in the norm. 

(b) Show that for any bounded operator $T$ there is a sequence $\{T_n\}$ of bounded 
operators of finite rank so that $T_n \to T$ strongly as $n \to \infty$. 


22. An operator $T$ is an isometry if $\|Tf\| = \|f\|$ for all $f \in \mathcal{H}$. 

(a) Show that if $T$ is an isometry, then $(Tf, Tg) = (f, g)$ for every $f, g \in \mathcal{H}$. 
Prove as a result that $T^*T = I$. 

(b) If $T$ is an isometry and $T$ is surjective, then $T$ is unitary and $TT^* = I$. 

(c) Give an example of an isometry that is not unitary. 

(d) Show that if $T^*T$ is unitary then $T$ is an isometry. 

[Hint: Use the fact that $(Tf, Tf) = (f, f)$ for $f$ replaced by $f \pm g$ and $f \pm ig$.] 

23. Suppose $\{T_k\}$ is a collection of bounded operators on a Hilbert space $\mathcal{H}$, with 
$\|T_k\| \le 1$ for all $k$. Suppose also that 

$T_k T_j^* = T_k^* T_j = 0$ for all $k \ne j$. 

Let $S_N = \sum_{k=-N}^{N} T_k$. 

Show that $S_N(f)$ converges as $N \to \infty$, for every $f \in \mathcal{H}$. If $T(f)$ denotes the 
limit, prove that $\|T\| \le 1$. 

A generalization is given in Problem 8* below. 

[Hint: Consider first the case when only finitely many of the $T_k$ are non-zero, and 
note that the ranges of the $T_k$ are mutually orthogonal.] 


24. Let $\{e_k\}_{k=1}^{\infty}$ denote an orthonormal set in a Hilbert space $\mathcal{H}$. If $\{c_k\}_{k=1}^{\infty}$ is a 
sequence of positive real numbers such that $\sum c_k^2 < \infty$, then the set 

$A = \left\{ \sum_{k=1}^{\infty} a_k e_k : |a_k| \le c_k \right\}$ 

is compact in $\mathcal{H}$. 

25. Suppose $T$ is a bounded operator that is diagonal with respect to a basis $\{\varphi_k\}$, 
with $T\varphi_k = \lambda_k \varphi_k$. Then $T$ is compact if and only if $\lambda_k \to 0$. 

[Hint: If $\lambda_k \to 0$, then note that $\|P_n T - T\| \to 0$, where $P_n$ is the orthogonal 
projection on the subspace spanned by $\varphi_1, \varphi_2, \ldots, \varphi_n$.] 

26. Suppose $w$ is a measurable function on $\mathbb{R}^d$ with $0 < w(x) < \infty$ for a.e. $x$, and 
$K$ is a measurable function on $\mathbb{R}^{2d}$ that satisfies: 

(i) $\int_{\mathbb{R}^d} |K(x, y)|\,w(y)\,dy \le A w(x)$ for almost every $x \in \mathbb{R}^d$, and 

(ii) $\int_{\mathbb{R}^d} |K(x, y)|\,w(x)\,dx \le A w(y)$ for almost every $y \in \mathbb{R}^d$. 

Prove that the integral operator defined by 

$Tf(x) = \int_{\mathbb{R}^d} K(x, y) f(y)\,dy, \quad x \in \mathbb{R}^d$ 

is bounded on $L^2(\mathbb{R}^d)$ with $\|T\| \le A$. 

Note as a special case that if $\int |K(x, y)|\,dy \le A$ for all $x$, and $\int |K(x, y)|\,dx \le A$ 
for all $y$, then $\|T\| \le A$. 

[Hint: Show that if $f \in L^2(\mathbb{R}^d)$, then 

$\int |K(x, y)|\,|f(y)|\,dy \le A^{1/2} w(x)^{1/2} \left[ \int |K(x, y)|\,|f(y)|^2\, w(y)^{-1}\,dy \right]^{1/2}$.] 

27. Prove that the operator 

$Tf(x) = \frac{1}{\pi} \int_0^{\infty} \frac{f(y)}{x + y}\,dy$ 

is bounded on $L^2(0, \infty)$ with norm $\|T\| \le 1$. 

[Hint: Use Exercise 26 with an appropriate $w$.] 

28. Suppose $\mathcal{H} = L^2(B)$, where $B$ is the unit ball in $\mathbb{R}^d$. Let $K(x, y)$ be a 
measurable function on $B \times B$ that satisfies $|K(x, y)| \le A|x - y|^{-d+\alpha}$ for some $\alpha > 0$, 
whenever $x, y \in B$. Define 

$Tf(x) = \int_B K(x, y) f(y)\,dy$. 

(a) Prove that $T$ is a bounded operator on $\mathcal{H}$. 

(b) Prove that $T$ is compact. 

(c) Note that $T$ is a Hilbert-Schmidt operator if and only if $\alpha > d/2$. 

[Hint: For (b), consider the operators $T_n$ associated with the truncated kernels 
$K_n(x, y) = K(x, y)$ if $|x - y| \ge 1/n$ and 0 otherwise. Show that each $T_n$ is compact, 
and that $\|T_n - T\| \to 0$ as $n \to \infty$.] 

29. Let $T$ be a compact operator on a Hilbert space $\mathcal{H}$, and assume $\lambda \ne 0$. 
(a) Show that the range of $\lambda I - T$ defined by 

$\{g \in \mathcal{H} : g = (\lambda I - T)f, \text{ for some } f \in \mathcal{H}\}$ 

is closed. [Hint: Suppose $g_j \to g$, where $g_j = (\lambda I - T)f_j$. Let $V_\lambda$ denote 
the eigenspace of $T$ corresponding to $\lambda$, that is, the kernel of $\lambda I - T$. Why 
can one assume that $f_j \in V_\lambda^\perp$? Under this assumption prove that $\{f_j\}$ is a 
bounded sequence.] 

(b) Show by example that this may fail when $\lambda = 0$. 

(c) Show that the range of $\lambda I - T$ is all of $\mathcal{H}$ if and only if the null-space of 
$\bar{\lambda} I - T^*$ is trivial. 


30. Let $\mathcal{H} = L^2([-\pi, \pi])$ with $[-\pi, \pi]$ identified as the unit circle. Fix a bounded 
sequence $\{\lambda_n\}_{n=-\infty}^{\infty}$ of complex numbers, and define an operator $Tf$ by 

$Tf(x) \sim \sum_{n=-\infty}^{\infty} \lambda_n a_n e^{inx}$ whenever $f(x) \sim \sum_{n=-\infty}^{\infty} a_n e^{inx}$. 

Such an operator is called a Fourier multiplier operator, and the sequence 
$\{\lambda_n\}$ is called the multiplier sequence. 

(a) Show that $T$ is a bounded operator on $\mathcal{H}$ and $\|T\| = \sup_n |\lambda_n|$. 

(b) Verify that $T$ commutes with translations, that is, if we define $\tau_h f(x) = 
f(x - h)$ then 

$T \circ \tau_h = \tau_h \circ T$ for every $h \in \mathbb{R}$. 

(c) Conversely, prove that if $T$ is any bounded operator on $\mathcal{H}$ that commutes 
with translations, then $T$ is a Fourier multiplier operator. [Hint: Consider 
$T(e^{inx})$.] 


31. Consider a version of the sawtooth function defined on $[-\pi, \pi)$ by⁵ 

$K(x) = i(\mathrm{sgn}(x)\pi - x)$, 

and extended to $\mathbb{R}$ with period $2\pi$. Suppose $f \in L^1([-\pi, \pi])$ is extended to $\mathbb{R}$ with 
period $2\pi$, and define 

$Tf(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} K(x - y) f(y)\,dy$. 

⁵ The symbol $\mathrm{sgn}(x)$ denotes the sign function: it equals 1 or $-1$ if $x$ is positive or 
negative respectively, and 0 if $x = 0$. 

(a) Show that $F(x) = Tf(x)$ is absolutely continuous, and if $\int_{-\pi}^{\pi} f(y)\,dy = 0$, 
then $F'(x) = i f(x)$ a.e. $x$. 

(b) Show that the mapping $f \mapsto Tf$ is compact and symmetric on $L^2([-\pi, \pi])$. 

(c) Prove that $\varphi(x) \in L^2([-\pi, \pi])$ is an eigenfunction for $T$ if and only if $\varphi(x)$ 
is (up to a constant multiple) equal to $e^{inx}$ for some integer $n \ne 0$ with 
eigenvalue $1/n$, or $\varphi(x) = 1$ with eigenvalue 0. 

(d) Show as a result that $\{e^{inx}\}_{n \in \mathbb{Z}}$ is an orthonormal basis of $L^2([-\pi, \pi])$. 

Note that in Book I, Chapter 2, Exercise 8, it is shown that the Fourier series 
of $K$ is 

$K(x) \sim \sum_{n \ne 0} \frac{e^{inx}}{n}$. 

32. Consider the operator $T : L^2([0, 1]) \to L^2([0, 1])$ defined by 

$T(f)(t) = t f(t)$. 

(a) Prove that $T$ is a bounded linear operator with $T = T^*$, but that $T$ is not 
compact. 

(b) However, show that $T$ has no eigenvectors. 

33. Let $\mathcal{H}$ be a Hilbert space with basis $\{\varphi_k\}_{k=1}^{\infty}$. Verify that the operator $T$ 
defined by 

$T(\varphi_k) = \frac{1}{k}\varphi_{k+1}$ 

is compact, but has no eigenvectors. 

34. Let $K$ be a Hilbert-Schmidt kernel which is real and symmetric. Then, as we 
saw, the operator $T$ whose kernel is $K$ is compact and symmetric. Let $\{\varphi_k(x)\}$ be 
the eigenvectors (with eigenvalues $\lambda_k$) that diagonalize $T$. Then: 

(a) $\sum_k |\lambda_k|^2 < \infty$. 

(b) $K(x, y) \sim \sum_k \lambda_k \varphi_k(x)\varphi_k(y)$ is the expansion of $K$ in the basis $\{\varphi_k(x)\varphi_j(y)\}$. 

(c) Suppose $T$ is a compact operator which is symmetric. Then $T$ is of Hilbert- 
Schmidt type if and only if $\sum_n |\lambda_n|^2 < \infty$, where $\{\lambda_n\}$ are the eigenvalues 
of $T$ counted according to their multiplicities. 


35. Let $\mathcal{H}$ be a Hilbert space. Prove the following variants of the spectral theorem. 

(a) If $T_1$ and $T_2$ are two linear symmetric and compact operators on $\mathcal{H}$ that 
commute (that is, $T_1 T_2 = T_2 T_1$), show that they can be diagonalized 
simultaneously. In other words, there exists an orthonormal basis for $\mathcal{H}$ which 
consists of eigenvectors for both $T_1$ and $T_2$. 

(b) A linear operator on $\mathcal{H}$ is normal if $TT^* = T^*T$. Prove that if $T$ is normal 
and compact, then $T$ can be diagonalized. 

[Hint: Write $T = T_1 + iT_2$ where $T_1$ and $T_2$ are symmetric, compact and 
commute.] 

(c) If $U$ is unitary, and $U = \lambda I - T$, where $T$ is compact, then $U$ can be 
diagonalized. 


8 Problems 

1. Let $\mathcal{H}$ be an infinite-dimensional Hilbert space. There exists a linear functional 
$\ell$ defined on $\mathcal{H}$ that is not bounded (and hence not continuous). 

[Hint: Using the axiom of choice (or one of its equivalent forms), construct an 
algebraic basis of $\mathcal{H}$, $\{e_\alpha\}$; it has the property that every element of $\mathcal{H}$ is uniquely 
a finite linear combination of the $\{e_\alpha\}$. Select a denumerable collection $\{e_n\}_{n=1}^{\infty}$, 
and define $\ell$ to satisfy the requirement that $\ell(e_n) = n\|e_n\|$ for all $n \in \mathbb{N}$.] 

2.* The following is an example of a non-separable Hilbert space. We consider 
the collection of exponentials $\{e^{i\lambda x}\}$ on $\mathbb{R}$, where $\lambda$ ranges over the real numbers. 
Let $\mathcal{H}_0$ denote the space of finite linear combinations of these exponentials. For 
$f, g \in \mathcal{H}_0$, we define the inner product as 

$(f, g) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(x)\overline{g(x)}\,dx$. 

(a) Show that this limit exists, and 

$(f, g) = \sum_{k=1}^{N} a_{\lambda_k} \overline{b_{\lambda_k}}$ 

if $f(x) = \sum_{k=1}^{N} a_{\lambda_k} e^{i\lambda_k x}$ and $g(x) = \sum_{k=1}^{N} b_{\lambda_k} e^{i\lambda_k x}$. 

(b) With this inner product $\mathcal{H}_0$ is a pre-Hilbert space. Notice that $\|f\| \le 
\sup_x |f(x)|$, if $f \in \mathcal{H}_0$, where $\|f\|$ denotes the norm $(f, f)^{1/2}$. Let $\mathcal{H}$ be 
the completion of $\mathcal{H}_0$. Then $\mathcal{H}$ is not separable because $e^{i\lambda x}$ and $e^{i\lambda' x}$ are 
orthonormal if $\lambda \ne \lambda'$. 

A continuous function $F$ defined on $\mathbb{R}$ is called almost periodic if it is the 
uniform limit (on $\mathbb{R}$) of elements in $\mathcal{H}_0$. Such functions can be identified 
with (certain) elements in the completion $\mathcal{H}$: We have $\mathcal{H}_0 \subset AP \subset \mathcal{H}$, where 
$AP$ denotes the almost periodic functions. 

(c) A continuous function $F$ is in $AP$ if for every $\epsilon > 0$ we can find a length 
$L = L_\epsilon$ such that any interval $I \subset \mathbb{R}$ of length $L$ contains an "almost period" 
$\tau$ satisfying 

$\sup_x |F(x + \tau) - F(x)| < \epsilon$. 

(d) An equivalent characterization is that $F$ is in $AP$ if and only if every 
sequence $F(x + h_n)$ of translates of $F$ contains a subsequence that converges 
uniformly. 

3. The following is a direct generalization of Fatou's theorem: if $u(re^{i\theta})$ is harmonic 
in the unit disc and bounded there, then $\lim_{r \to 1} u(re^{i\theta})$ exists for a.e. $\theta$. 

[Hint: Let $a_n(r) = \frac{1}{2\pi} \int_0^{2\pi} u(re^{i\theta}) e^{-in\theta}\,d\theta$. Then $a_n''(r) + \frac{1}{r} a_n'(r) - \frac{n^2}{r^2} a_n(r) = 0$, 
hence $a_n(r) = A_n r^n + B_n r^{-n}$, $n \ne 0$, and as a result⁶ $u(re^{i\theta}) = \sum_{-\infty}^{\infty} a_n r^{|n|} e^{in\theta}$. 

From this one can proceed as in the proof of Theorem 3.3.] 

4.* This problem provides some examples of functions that fail to have radial limits 
almost everywhere. 

(a) At almost every point of the boundary unit circle, the function $\sum_{n=0}^{\infty} z^{2^n}$ 
fails to have a radial limit. 

(b) More generally, suppose $F(z) = \sum_{n=0}^{\infty} a_n z^{2^n}$. Then, if $\sum |a_n|^2 = \infty$ the 
function $F$ fails to have radial limits at almost every boundary point. However, 
if $\sum |a_n|^2 < \infty$, then $F \in H^2(\mathbb{D})$, and we know by the proof of 
Theorem 3.3 that $F$ does have radial limits almost everywhere. 

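5.* Suppose $F$ is holomorphic in the unit disc, and 

$\sup_{0 \le r < 1} \frac{1}{2\pi} \int_{-\pi}^{\pi} \log^+ |F(re^{i\theta})|\,d\theta < \infty$, 

where $\log^+ u = \log u$ if $u \ge 1$, and $\log^+ u = 0$ if $u < 1$. 
Then $\lim_{r \to 1} F(re^{i\theta})$ exists for almost every $\theta$. 

The above condition is satisfied whenever (say) 

$\sup_{0 \le r < 1} \frac{1}{2\pi} \int_{-\pi}^{\pi} |F(re^{i\theta})|^p\,d\theta < \infty$, for some $p > 0$ 

(since $e^{pu} \ge pu$, $u \ge 0$). 

Functions that satisfy the latter condition are said to belong to the Hardy 
space $H^p(\mathbb{D})$. 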

6.* If $T$ is compact, and $\lambda \ne 0$, show that 

(a) $\lambda I - T$ is injective if and only if $\bar{\lambda} I - T^*$ is injective. 

(b) $\lambda I - T$ is injective if and only if $\lambda I - T$ is surjective. 

This result, known as the Fredholm alternative, is often combined with that in 
Exercise 29. 

⁶ See also Section 5, Chapter 2 in Book I. 


7. Show that the identity operator on $L^2(\mathbb{R}^d)$ cannot be given as an (absolutely) 
convergent integral operator. More precisely, if $K(x, y)$ is a measurable function 
on $\mathbb{R}^d \times \mathbb{R}^d$ with the property that for each $f \in L^2(\mathbb{R}^d)$, the integral $T(f)(x) = 
\int_{\mathbb{R}^d} K(x, y) f(y)\,dy$ converges for almost every $x$, then $T(f) \ne f$ for some $f$. 

[Hint: Prove that otherwise for any pair of disjoint balls $B_1$ and $B_2$ in $\mathbb{R}^d$, we 
would have that $K(x, y) = 0$ for a.e. $(x, y) \in B_1 \times B_2$.] 

8.* Suppose $\{T_k\}$ is a collection of bounded operators on a Hilbert space $\mathcal{H}$. 
Assume that 

$\|T_k T_j^*\| \le a_{k-j}^2$ and $\|T_k^* T_j\| \le a_{k-j}^2$, 

for positive constants $\{a_n\}$ with the property that $\sum_{n=-\infty}^{\infty} a_n = A < \infty$. Then 
$S_N(f)$ converges as $N \to \infty$, for every $f \in \mathcal{H}$, with $S_N = \sum_{k=-N}^{N} T_k$. Moreover, 
$T = \lim_{N \to \infty} S_N$ satisfies $\|T\| \le A$. 


9. A discussion of a class of regular Sturm-Liouville operators follows. Other 
special examples are given in the problems below. 

Suppose $[a, b]$ is a bounded interval, and $L$ is defined on functions $f$ that are 
twice continuously differentiable in $[a, b]$ (we write, $f \in C^2([a, b])$) by 

$L(f)(x) = \frac{d^2 f}{dx^2} - q(x) f(x)$. 

Here the function $q$ is continuous and real-valued on $[a, b]$, and we assume for 
simplicity that $q$ is non-negative. We say that $\varphi \in C^2([a, b])$ is an eigenfunction 
of $L$ with eigenvalue $\mu$ if $L(\varphi) = \mu\varphi$, under the assumption that $\varphi$ satisfies the 
boundary conditions $\varphi(a) = \varphi(b) = 0$. Then one can show: 

(a) The eigenvalues $\mu$ are strictly negative, and the eigenspace corresponding 
to each eigenvalue is one-dimensional. 

(b) Eigenvectors corresponding to distinct eigenvalues are orthogonal in $L^2([a, b])$. 

(c) Let $K(x, y)$ be the "Green's kernel" defined as follows. Choose $\varphi_-(x)$ to be 
a solution of $L(\varphi_-) = 0$, with $\varphi_-(a) = 0$ but $\varphi_-'(a) \ne 0$. Similarly, choose 
$\varphi_+(x)$ to be a solution of $L(\varphi_+) = 0$ with $\varphi_+(b) = 0$, but $\varphi_+'(b) \ne 0$. Let 
$w = \varphi_+'(x)\varphi_-(x) - \varphi_-'(x)\varphi_+(x)$ be the "Wronskian" of these solutions, and 
note that $w$ is a non-zero constant. 

Set 

$K(x, y) = \begin{cases} \dfrac{\varphi_-(x)\varphi_+(y)}{w} & \text{if } a \le x \le y \le b, \\[2mm] \dfrac{\varphi_+(x)\varphi_-(y)}{w} & \text{if } a \le y \le x \le b. \end{cases}$ 

Then the operator $T$ defined by 

$T(f)(x) = \int_a^b K(x, y) f(y)\,dy$ 

is a Hilbert-Schmidt operator, and hence compact. It is also symmetric. 
Moreover, whenever $f$ is continuous on $[a, b]$, $Tf$ is of class $C^2([a, b])$ and 

$L(Tf) = f$. 

(d) As a result, each eigenvector of $T$ (with eigenvalue $\lambda$) is an eigenvector of $L$ 
(with eigenvalue $\mu = 1/\lambda$). Hence Theorem 6.2 proves the completeness of 
the orthonormal set arising from normalizing the eigenvectors of $L$. 


10.* Let $L$ be defined on $C^2([-1, 1])$ by 

$L(f)(x) = (1 - x^2)\frac{d^2 f}{dx^2} - 2x\frac{df}{dx}$. 

If $\varphi_n$ is the $n$th Legendre polynomial, given by 

$\varphi_n(x) = \left(\frac{d}{dx}\right)^n (1 - x^2)^n$, $n = 0, 1, 2, \ldots$, 

then $L\varphi_n = -n(n + 1)\varphi_n$. 

When normalized the $\varphi_n$ form an orthonormal basis of $L^2([-1, 1])$ (see also 
Problem 2, Chapter 3 in Book I, where $\varphi_n$ is denoted by $L_n$.) 

11.* The Hermite functions $h_k(x)$ are defined by the generating identity 

$\sum_{k=0}^{\infty} h_k(x)\frac{t^k}{k!} = e^{-(x^2/2 - 2tx + t^2)}$. 

(a) They satisfy the "creation" and "annihilation" identities $\left(x - \frac{d}{dx}\right) h_k(x) = 
h_{k+1}(x)$ and $\left(x + \frac{d}{dx}\right) h_k(x) = 2k\,h_{k-1}(x)$ for $k \ge 0$ where $h_{-1}(x) = 0$. Note 
that $h_0(x) = e^{-x^2/2}$, $h_1(x) = 2x e^{-x^2/2}$, and more generally $h_k(x) = 
P_k(x) e^{-x^2/2}$, where $P_k$ is a polynomial of degree $k$. 

(b) Using (a) one sees that the $h_k$ are eigenvectors of the operator $L = -d^2/dx^2 + 
x^2$, with $L(h_k) = \lambda_k h_k$, where $\lambda_k = 2k + 1$. One observes that these 
functions are mutually orthogonal. Since 

$\int_{\mathbb{R}} [h_k(x)]^2\,dx = \pi^{1/2} 2^k k! = c_k$, 

we can normalize them obtaining an orthonormal sequence $\{H_k\}$, with $H_k = 
c_k^{-1/2} h_k$. This sequence is complete in $L^2(\mathbb{R})$ since $\int_{\mathbb{R}} f H_k\,dx = 0$ for all $k$ 
implies $\int_{-\infty}^{\infty} f(x) e^{-x^2/2 + 2tx}\,dx = 0$ for all $t \in \mathbb{C}$. 

(c) Suppose that $K(x, y) = \sum_{k=0}^{\infty} \frac{H_k(x) H_k(y)}{\lambda_k}$, and also $F(x) = T(f)(x) = 
\int_{\mathbb{R}} K(x, y) f(y)\,dy$. Then $T$ is a symmetric Hilbert-Schmidt operator, and 
if $f \sim \sum_{k=0}^{\infty} a_k H_k$, then $F \sim \sum_{k=0}^{\infty} \frac{a_k}{\lambda_k} H_k$. 

One can show on the basis of (a) and (b) that whenever $f \in L^2(\mathbb{R})$, not only is 
$F \in L^2(\mathbb{R})$, but also $x^2 F(x) \in L^2(\mathbb{R})$. Moreover, $F$ can be corrected on a set of 
measure zero, so it is continuously differentiable, $F'$ is absolutely continuous, and 
$F'' \in L^2(\mathbb{R})$. Finally, the operator $T$ is the inverse of $L$ in the sense that 

$LT(f) = LF = -F'' + x^2 F = f$ for every $f \in L^2(\mathbb{R})$. 

(See also Problem 7* in Chapter 5 of Book I.) 




Hilbert Spaces: Several 
Examples 


What is the difference between a mathematician and 
a physicist? It is this: To a mathematician all Hilbert 
spaces are the same; for a physicist, however, it is their 
different realizations that really matter. 

Attributed to E. Wigner, ca. 1960 


Hilbert spaces arise in a large number of different contexts in analysis. 
Although it is a truism that all (infinite-dimensional) Hilbert spaces are 
the same, it is in fact their varied and distinct realizations and separate 
applications that make them of such interest in mathematics. We shall 
illustrate this via several examples. 

To begin with, we consider the Plancherel formula and the resulting 
unitary character of the Fourier transform. The relevance of these ideas 
to complex analysis is then highlighted by the study of holomorphic 
functions in a half-space that belong to the Hardy space $H^2$. That function 
space itself is another interesting realization of a Hilbert space. The 
considerations here are analogous to the ideas that led us to Fatou's theorem 
for the unit disc, but are of a more involved character. 

We next see how complex analysis and the Fourier transform combine 
to guarantee the existence of solutions to linear partial differential 
equations with constant coefficients. The proof relies on a basic $L^2$ 
estimate, which once established can be exploited by simple Hilbert space 
techniques. 

Our final example is Dirichlet's principle and its applications to the 
boundary value problem for harmonic functions. Here the Hilbert space 
that arises is given by Dirichlet's integral, and the solution is expressed 
by aid of an appropriate orthogonal projection operator. 

1 The Fourier transform on $L^2$ 

The Fourier transform of a function $f$ on $\mathbb{R}^d$ is defined by 

(1) $\hat{f}(\xi) = \int_{\mathbb{R}^d} f(x) e^{-2\pi i x \cdot \xi}\,dx$, 

and its attached inversion is given by 

(2) $f(x) = \int_{\mathbb{R}^d} \hat{f}(\xi) e^{2\pi i x \cdot \xi}\,d\xi$. 

These formulas have already appeared in several different contexts. 
We considered first (in Book I) the properties of the Fourier transform 
in the elementary setting by restricting to functions in the Schwartz class 
$\mathcal{S}(\mathbb{R}^d)$. The class $\mathcal{S}$ consists of functions $f$ that are smooth (indefinitely 
differentiable) and such that for each multi-index $\alpha$ and $\beta$, the function 
$x^\alpha \left(\frac{\partial}{\partial x}\right)^\beta f$ is bounded on $\mathbb{R}^d$.¹ We saw that on this class the Fourier 
transform is a bijection, that the inversion formula (2) holds, and moreover 
we have the Plancherel identity 

(3) $\int_{\mathbb{R}^d} |\hat{f}(\xi)|^2\,d\xi = \int_{\mathbb{R}^d} |f(x)|^2\,dx$. 
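
As a numerical illustration of the Plancherel identity (3) in dimension $d = 1$, the following sketch approximates both sides by the trapezoid rule for one sample Schwartz function. The choice $f(x) = (1 + x)e^{-\pi x^2}$, the helper names, and the truncation widths are assumptions made only for this example; for this $f$ one can compute by hand that $\hat{f}(\xi) = (1 - i\xi)e^{-\pi \xi^2}$, which gives an independent check.

```python
import cmath
import math


def f(x):
    """Sample Schwartz function f(x) = (1 + x) exp(-pi x^2) (an arbitrary choice)."""
    return (1.0 + x) * math.exp(-math.pi * x * x)


def fourier_transform(func, xi, half_width=6.0, n=800):
    """Trapezoid approximation of the integral of func(x) e^{-2 pi i x xi}
    over [-half_width, half_width]; the tails are negligible for this f."""
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * func(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h


def squared_l2_norm(func, half_width=4.0, n=160):
    """Trapezoid approximation of the integral of |func|^2 over [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n + 1):
        t = -half_width + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * abs(func(t)) ** 2
    return total * h


# Left and right sides of the Plancherel identity (3) for this f.
lhs = squared_l2_norm(lambda xi: fourier_transform(f, xi))
rhs = squared_l2_norm(f)
print(lhs, rhs)
```

The two printed numbers should agree to many digits, and both should be close to the exact value $1/\sqrt{2} + 1/(4\pi\sqrt{2})$ obtained from Gaussian integrals. This is only a sanity check for one function, not a substitute for the proof.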

Turning now to more general (in particular, non-continuous) functions, 
we note that the largest class for which the integral defining $\hat{f}(\xi)$ 
converges (absolutely) is the space $L^1(\mathbb{R}^d)$. For it, we saw in Chapter 2 that 
a (relative) inversion formula is valid. 

Beyond these particular facts, what we would like here is to reestablish 
in the general context the symmetry between $f$ and $\hat{f}$ that holds for $\mathcal{S}$. 
This is where the special role of the Hilbert space $L^2(\mathbb{R}^d)$ enters. 

We shall define the Fourier transform on $L^2(\mathbb{R}^d)$ as an extension of its 
definition on $\mathcal{S}$. For this purpose, we temporarily adopt the notational 
device of denoting by $\mathcal{F}_0$ and $\mathcal{F}$ the Fourier transform on $\mathcal{S}$ and its 
extension to $L^2$, respectively. 

The main results we prove are the following. 

Theorem 1.1 The Fourier transform $\mathcal{F}_0$, initially defined on $\mathcal{S}(\mathbb{R}^d)$, 
has a (unique) extension $\mathcal{F}$ to a unitary mapping of $L^2(\mathbb{R}^d)$ to itself. In 
particular, 

$\|\mathcal{F}(f)\|_{L^2(\mathbb{R}^d)} = \|f\|_{L^2(\mathbb{R}^d)}$ 

for all $f \in L^2(\mathbb{R}^d)$. 

The extension $\mathcal{F}$ will be given by a limiting process: if $\{f_n\}$ is a sequence 
in the Schwartz space that converges to $f$ in $L^2(\mathbb{R}^d)$, then $\{\mathcal{F}_0(f_n)\}$ will 
converge to an element in $L^2(\mathbb{R}^d)$ which we will define as the Fourier 
transform of $f$. To implement this approach we have to see that every $L^2$ 
function can be approximated by elements in the Schwartz space. 

¹ Recall that $x^\alpha = x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_d^{\alpha_d}$ and $\left(\frac{\partial}{\partial x}\right)^\beta = \frac{\partial^{|\beta|}}{\partial x_1^{\beta_1} \cdots \partial x_d^{\beta_d}}$, where $\alpha = 
(\alpha_1, \ldots, \alpha_d)$ and $\beta = (\beta_1, \ldots, \beta_d)$, with $\alpha_j$ and $\beta_j$ non-negative integers. The order of $\alpha$ is 
denoted by $|\alpha|$ and defined to be $\alpha_1 + \cdots + \alpha_d$. 

Lemma 1.2 The space $\mathcal{S}(\mathbb{R}^d)$ is dense in $L^2(\mathbb{R}^d)$. In other words, given 
any $f \in L^2(\mathbb{R}^d)$, there exists a sequence $\{f_n\} \subset \mathcal{S}(\mathbb{R}^d)$ such that 

$\|f - f_n\|_{L^2(\mathbb{R}^d)} \to 0$ as $n \to \infty$. 

For the proof of the lemma, we fix $f \in L^2(\mathbb{R}^d)$ and $\epsilon > 0$. Then, for 
each $M > 0$, we define 

$g_M(x) = \begin{cases} f(x) & \text{if } |x| \le M \text{ and } |f(x)| \le M, \\ 0 & \text{otherwise.} \end{cases}$ 

Then, $|f(x) - g_M(x)| \le 2|f(x)|$, hence $|f(x) - g_M(x)|^2 \le 4|f(x)|^2$, and 
since $g_M(x) \to f(x)$ as $M \to \infty$ for almost every $x$, the dominated 
convergence theorem guarantees that for some $M$, we have 

$\|f - g_M\|_{L^2(\mathbb{R}^d)} < \epsilon$. 

We write $g = g_M$, note that this function is bounded and supported on 
a bounded set, and observe that it now suffices to approximate $g$ by 
functions in the Schwartz space. To achieve this goal, we use a method 
called regularization, which consists of "smoothing" $g$ by convolving it 
with an approximation of the identity. Consider a function $\varphi(x)$ on $\mathbb{R}^d$ 
with the following properties: 


(a) $\varphi$ is smooth (indefinitely differentiable). 

(b) $\varphi$ is supported in the unit ball. 

(c) $\varphi \ge 0$. 

(d) $\int_{\mathbb{R}^d} \varphi(x)\,dx = 1$. 

For instance, one can take 

$\varphi(x) = \begin{cases} c\,e^{-\frac{1}{1 - |x|^2}} & \text{if } |x| < 1, \\ 0 & \text{if } |x| \ge 1, \end{cases}$ 

where the constant $c$ is chosen so that (d) holds. 

Next, we consider the approximation to the identity defined by 

$K_\delta(x) = \delta^{-d} \varphi(x/\delta)$. 
The key observation is that $g * K_\delta$ belongs to $\mathcal{S}(\mathbb{R}^d)$, with this 
convolution in fact bounded and supported on a fixed bounded set, uniformly in 
$\delta$ (assuming for example that $\delta \le 1$). Indeed, we may write 

$(g * K_\delta)(x) = \int g(y) K_\delta(x - y)\,dy = \int g(x - y) K_\delta(y)\,dy$, 

in view of the identity (6) in Chapter 2. We note that since $g$ is supported 
on some bounded set and $K_\delta$ vanishes outside the ball of radius $\delta$, the 
function $g * K_\delta$ is supported in some fixed bounded set independent of $\delta$. 
Also, the function $g$ is bounded by construction, hence 

$|(g * K_\delta)(x)| \le \int |g(x - y)| K_\delta(y)\,dy \le \sup_{z \in \mathbb{R}^d} |g(z)| \int K_\delta(y)\,dy = \sup_{z \in \mathbb{R}^d} |g(z)|$, 

which shows that $g * K_\delta$ is also uniformly bounded in $\delta$. Moreover, from 
the first integral expression for $g * K_\delta$ above, one may differentiate under 
the integral sign to see that $g * K_\delta$ is smooth and all of its derivatives 
have support in some fixed bounded set. 

The proof of the lemma will be complete if we can show that $g * K_\delta$ 
converges to $g$ in $L^2(\mathbb{R}^d)$. Now Theorem 2.1 in Chapter 3 guarantees 
that for almost every $x$, the quantity $|(g * K_\delta)(x) - g(x)|^2$ converges to 0 
as $\delta$ tends to 0. An application of the bounded convergence theorem 
(Theorem 1.4 in Chapter 2) yields 

$\|(g * K_\delta) - g\|_{L^2(\mathbb{R}^d)}^2 \to 0$ as $\delta \to 0$. 

In particular, $\|(g * K_\delta) - g\|_{L^2(\mathbb{R}^d)} < \epsilon$ for an appropriate $\delta$ and hence 
$\|f - g * K_\delta\|_{L^2(\mathbb{R}^d)} < 2\epsilon$, and choosing a sequence of $\epsilon$ tending to zero 
gives the construction of the desired sequence $\{f_n\}$. 
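
The regularization step can be tried numerically in one dimension. The sketch below takes $g$ to be the characteristic function of $[-1, 1]$ (a bounded, compactly supported stand-in chosen for this example), uses the bump $\varphi(u) = c\,e^{-1/(1 - u^2)}$ from the proof, and estimates $\|g * K_\delta - g\|_{L^2}^2$ by midpoint sums. The grid sizes and all function names are assumptions of the illustration; the normalizing constant $c$ is replaced by a discrete normalization of the quadrature weights.

```python
import math


def bump(u):
    """Unnormalized bump exp(-1/(1 - u^2)) on (-1, 1), zero elsewhere."""
    return math.exp(-1.0 / (1.0 - u * u)) if abs(u) < 1.0 else 0.0


# Midpoint nodes on (-1, 1) and weights that play the role of phi(u) du.
M = 200
NODES = [-1.0 + (k + 0.5) * (2.0 / M) for k in range(M)]
RAW = [bump(u) for u in NODES]
TOTAL = sum(RAW)
WEIGHTS = [r / TOTAL for r in RAW]  # normalized so the weights sum to 1


def g(x):
    """Bounded function with bounded support: the indicator of [-1, 1]."""
    return 1.0 if abs(x) <= 1.0 else 0.0


def mollified(x, delta):
    """(g * K_delta)(x), written as the integral of g(x - delta*u) phi(u) du."""
    return sum(w * g(x - delta * u) for u, w in zip(NODES, WEIGHTS))


def l2_error_squared(delta, half_width=2.0, cells=800):
    """Midpoint-rule estimate of the integral of |g * K_delta - g|^2."""
    h = 2.0 * half_width / cells
    total = 0.0
    for j in range(cells):
        x = -half_width + (j + 0.5) * h
        total += (mollified(x, delta) - g(x)) ** 2
    return total * h


for delta in (0.5, 0.25, 0.125):
    print(delta, l2_error_squared(delta))
```

For this particular $g$ a scaling argument shows that the squared $L^2$ error is proportional to $\delta$ when $\delta \le 1$, so the printed values should roughly halve each time $\delta$ is halved. The output also reflects the support statement in the proof: `mollified(x, delta)` vanishes once $|x| > 1 + \delta$.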

For later purposes it is useful to observe that the proof of the above 
lemma establishes the following assertion: if $f$ belongs to both $L^1(\mathbb{R}^d)$ 
and $L^2(\mathbb{R}^d)$, then there is a sequence $\{f_n\}$, $f_n \in \mathcal{S}(\mathbb{R}^d)$, that converges 
to $f$ in both the $L^1$-norm and the $L^2$-norm. 

Our definition of the Fourier transform on $L^2(\mathbb{R}^d)$ combines the above 
density of $\mathcal{S}$ with a general "extension principle." 

Lemma 1.3 Let $\mathcal{H}_1$ and $\mathcal{H}_2$ denote Hilbert spaces with norms $\|\cdot\|_1$ and 
$\|\cdot\|_2$, respectively. Suppose $\mathcal{S}$ is a dense subspace of $\mathcal{H}_1$ and $T_0 : \mathcal{S} \to \mathcal{H}_2$ 
a linear transformation that satisfies $\|T_0(f)\|_2 \le c\|f\|_1$ whenever $f \in \mathcal{S}$. 
Then $T_0$ extends to a (unique) linear transformation $T : \mathcal{H}_1 \to \mathcal{H}_2$ that 
satisfies $\|T(f)\|_2 \le c\|f\|_1$ for all $f \in \mathcal{H}_1$. 

Proof. Given $f \in \mathcal{H}_1$, let $\{f_n\}$ be a sequence in $\mathcal{S}$ that converges to $f$, 
and define 

$T(f) = \lim_{n \to \infty} T_0(f_n)$, 

where the limit is taken in $\mathcal{H}_2$. To see that $T$ is well-defined we must 
verify that the limit exists, and that it is independent of the sequence 
$\{f_n\}$ used to approximate $f$. Indeed, for the first point, we note that 
$\{T_0(f_n)\}$ is a Cauchy sequence in $\mathcal{H}_2$ because by construction $\{f_n\}$ is 
Cauchy in $\mathcal{H}_1$, and the inequality verified by $T_0$ yields 

$\|T_0(f_n) - T_0(f_m)\|_2 \le c\|f_n - f_m\|_1 \to 0$ as $n, m \to \infty$; 

thus $\{T_0(f_n)\}$ is Cauchy, hence converges in $\mathcal{H}_2$. 

Second, to justify that the limit is independent of the approximating 
sequence, let $\{g_n\}$ be another sequence in $\mathcal{S}$ that converges to $f$ in $\mathcal{H}_1$. 
Then 

$\|T_0(f_n) - T_0(g_n)\|_2 \le c\|f_n - g_n\|_1$, 

and since $\|f_n - g_n\|_1 \le \|f_n - f\|_1 + \|f - g_n\|_1$, we conclude that $\{T_0(g_n)\}$ 
converges to a limit in $\mathcal{H}_2$ that equals the limit of $\{T_0(f_n)\}$. 

Finally, we recall that if $f_n \to f$ and $T_0(f_n) \to T(f)$, then $\|f_n\|_1 \to 
\|f\|_1$ and $\|T_0(f_n)\|_2 \to \|T(f)\|_2$, so in the limit as $n \to \infty$, the inequality 
$\|T(f)\|_2 \le c\|f\|_1$ holds for all $f \in \mathcal{H}_1$. 

In the present case of the Fourier transform, we apply this lemma with 
$\mathcal{H}_1 = \mathcal{H}_2 = L^2(\mathbb{R}^d)$ (equipped with the $L^2$-norm), $\mathcal{S} = \mathcal{S}(\mathbb{R}^d)$, and $T_0 = 
\mathcal{F}_0$ the Fourier transform defined on the Schwartz space. The Fourier 
transform on $L^2(\mathbb{R}^d)$ is by definition the unique (bounded) extension of 
$\mathcal{F}_0$ to $L^2$ guaranteed by Lemma 1.3. Thus if $f \in L^2(\mathbb{R}^d)$ and $\{f_n\}$ is any 
sequence in $\mathcal{S}(\mathbb{R}^d)$ that converges to $f$ (that is, $\|f - f_n\|_{L^2(\mathbb{R}^d)} \to 0$ as 
$n \to \infty$), we define the Fourier transform of $f$ by 

(4) $\mathcal{F}(f) = \lim_{n \to \infty} \mathcal{F}_0(f_n)$, 

where the limit is taken in the $L^2$ sense. Clearly, the argument in the 
proof of the lemma shows that in our special case the extension $\mathcal{F}$ 
continues to satisfy the identity (3): 

$\|\mathcal{F}(f)\|_{L^2(\mathbb{R}^d)} = \|f\|_{L^2(\mathbb{R}^d)}$ whenever $f \in L^2(\mathbb{R}^d)$. 
The fact that $\mathcal{F}$ is invertible on $L^2$ (and thus $\mathcal{F}$ is a unitary mapping) 
is also a consequence of the analogous property on $\mathcal{S}(\mathbb{R}^d)$. Recall that 
on the Schwartz space, $\mathcal{F}_0^{-1}$ is given by formula (2), that is, 

$\mathcal{F}_0^{-1}(g)(x) = \int_{\mathbb{R}^d} g(\xi) e^{2\pi i x \cdot \xi}\,d\xi$, 

and satisfies again the identity $\|\mathcal{F}_0^{-1}(g)\|_{L^2} = \|g\|_{L^2}$. Therefore, arguing 
in the same fashion as above, we can extend $\mathcal{F}_0^{-1}$ to $L^2(\mathbb{R}^d)$ by a limiting 
argument. Then, given $f \in L^2(\mathbb{R}^d)$, we choose a sequence $\{f_n\}$ in the 
Schwartz space so that $\|f - f_n\|_{L^2} \to 0$. We have 

$f_n = \mathcal{F}_0^{-1}\mathcal{F}_0(f_n) = \mathcal{F}_0\mathcal{F}_0^{-1}(f_n)$, 

and taking the limit as $n$ tends to infinity, we see that 

$f = \mathcal{F}^{-1}\mathcal{F}(f) = \mathcal{F}\mathcal{F}^{-1}(f)$, 

and hence $\mathcal{F}$ is invertible. This concludes the proof of Theorem 1.1. 
Some remarks are in order. 

(i) Suppose $f$ belongs to both $L^1(\mathbb{R}^d)$ and $L^2(\mathbb{R}^d)$. Are the two definitions
of the Fourier transform the same? That is, do we have $\mathcal{F}(f) = \hat{f}$, with
$\mathcal{F}(f)$ defined by the limiting process in Theorem 1.1 and $\hat{f}$ defined by the
convergent integral (1)? To prove that this is indeed the case we recall
that we can approximate $f$ by a sequence $\{f_n\}$ in $\mathcal{S}$ so that $f_n \to f$ both
in the $L^1$-norm and the $L^2$-norm. Since $\mathcal{F}_0(f_n) = \hat{f}_n$, a passage to the
limit gives the desired conclusion. In fact, $\mathcal{F}_0(f_n)$ converges to $\mathcal{F}(f)$ in
the $L^2$-norm, so a subsequence converges to $\mathcal{F}(f)$ almost everywhere; see
the analogous statement for $L^1$ in Corollary 2.3, Chapter 2. Moreover,

$\sup_{\xi} |\hat{f}_n(\xi) - \hat{f}(\xi)| \le \|f_n - f\|_{L^1(\mathbb{R}^d)},$

hence $\hat{f}_n$ converges to $\hat{f}$ everywhere, and the assertion is established.

(ii) The theorem gives a rather abstract definition of the Fourier transform
on $L^2$. In view of what we have just said, we can also define the
Fourier transform more concretely as follows. If $f \in L^2(\mathbb{R}^d)$, then

$\hat{f}(\xi) = \lim_{R \to \infty} \int_{|x| \le R} f(x) e^{-2\pi i x \cdot \xi}\,dx,$

where the limit is taken in the $L^2$-norm. Note in fact that if $\chi_R$ denotes
the characteristic function of the ball $\{x \in \mathbb{R}^d : |x| \le R\}$, then for each
$R$ the function $f\chi_R$ is in both $L^1$ and $L^2$, and $f\chi_R \to f$ in the $L^2$-norm.

(iii) The identity of the various definitions of the Fourier transform discussed
above allows us to choose $\hat{f}$ as the preferred notation for the
Fourier transform. We adopt this practice in what follows.

2 The Hardy space of the upper half-plane 

We will apply the $L^2$ theory of the Fourier transform to holomorphic
functions in the upper half-plane. This leads us to consider the relevant
analogues of the Hardy space and Fatou's theorem discussed in the previous
chapter.$^2$ It incidentally provides an answer to the following natural
question: What are the functions $f \in L^2(\mathbb{R})$ whose Fourier transforms
are supported on the half-line $(0, \infty)$?

Let $\mathbb{R}^2_+ = \{z = x + iy,\ x \in \mathbb{R},\ y > 0\}$ be the upper half-plane. We
define the Hardy space $H^2(\mathbb{R}^2_+)$ to consist of all functions $F$ analytic
in $\mathbb{R}^2_+$ with the property that

(5)    $\sup_{y>0} \int_{\mathbb{R}} |F(x + iy)|^2\,dx < \infty.$

We define the corresponding norm, $\|F\|_{H^2(\mathbb{R}^2_+)}$, to be the square root of
the quantity (5).

Let us first describe a (typical) example of a function $F$ in $H^2(\mathbb{R}^2_+)$.
We start with a function $\hat{F}_0$ that belongs to $L^2(0, \infty)$, and write

(6)    $F(x + iy) = \int_0^\infty \hat{F}_0(\xi) e^{2\pi i \xi z}\,d\xi, \quad z = x + iy,\ y > 0.$

(The choice of the particular notation $\hat{F}_0$ will become clearer below.)
We claim that for any $\delta > 0$ the integral (6) converges absolutely and
uniformly as long as $y \ge \delta$. Indeed, $|\hat{F}_0(\xi) e^{2\pi i \xi z}| = |\hat{F}_0(\xi)| e^{-2\pi \xi y}$, hence
by the Cauchy-Schwarz inequality

$\int_0^\infty |\hat{F}_0(\xi) e^{2\pi i \xi z}|\,d\xi \le \left( \int_0^\infty |\hat{F}_0(\xi)|^2\,d\xi \right)^{1/2} \left( \int_0^\infty e^{-4\pi \xi \delta}\,d\xi \right)^{1/2},$

from which the asserted convergence is established. From the uniform
convergence it follows that $F(z)$ is holomorphic in the upper half-plane.
Moreover, by Plancherel's theorem

$\int_{\mathbb{R}} |F(x + iy)|^2\,dx = \int_0^\infty |\hat{F}_0(\xi)|^2 e^{-4\pi \xi y}\,d\xi \le \|\hat{F}_0\|^2_{L^2(0,\infty)},$




2 Further motivation and some elementary background material may be found in
Theorem 3.5 in Chapter 4 of Book II.
and in fact, by the monotone convergence theorem,

$\sup_{y>0} \int_{\mathbb{R}} |F(x + iy)|^2\,dx = \|\hat{F}_0\|^2_{L^2(0,\infty)}.$

In particular, $F$ belongs to $H^2(\mathbb{R}^2_+)$. The main result we prove next is
the converse, that is, every element of the space $H^2(\mathbb{R}^2_+)$ is in fact of the
form (6).
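The example can be checked numerically in one concrete case. The sketch below is only an illustration under stated assumptions: it takes $\hat{F}_0$ to be the characteristic function of $(0,1)$ (a choice made here, not in the text), for which (6) gives $F(z) = (e^{2\pi i z} - 1)/(2\pi i z)$, and it compares a truncated midpoint-rule value of $\int_{\mathbb{R}} |F(x+iy)|^2\,dx$ with $\int_0^1 e^{-4\pi \xi y}\,d\xi$. The function names, the cutoff and the step count are hypothetical.

```python
import cmath
import math


def hardy_example(z: complex) -> complex:
    """F(z) from (6) when F_0-hat is the indicator of (0, 1) (illustrative choice).

    Integrating e^{2 pi i xi z} over 0 <= xi <= 1 gives (e^{2 pi i z} - 1) / (2 pi i z).
    """
    return (cmath.exp(2j * math.pi * z) - 1) / (2j * math.pi * z)


def horizontal_l2_norm_squared(y: float, cutoff: float = 1000.0, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the integral of |F(x + iy)|^2 over |x| <= cutoff."""
    h = 2 * cutoff / steps
    total = 0.0
    for k in range(steps):
        x = -cutoff + (k + 0.5) * h
        total += abs(hardy_example(complex(x, y))) ** 2
    return h * total


def plancherel_side(y: float) -> float:
    """Integral of e^{-4 pi xi y} over 0 <= xi <= 1 (the Plancherel side for this choice)."""
    return (1 - math.exp(-4 * math.pi * y)) / (4 * math.pi * y)
```

For this choice `plancherel_side(y)` increases toward $\|\hat{F}_0\|^2_{L^2(0,\infty)} = 1$ as $y$ decreases to $0$, in line with the monotone convergence statement; the truncation at `cutoff` leaves a small tail error because $|F(x+iy)|^2$ decays only like $1/x^2$.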

Theorem 2.1 The elements $F$ in $H^2(\mathbb{R}^2_+)$ are exactly the functions
given by (6), with $\hat{F}_0 \in L^2(0, \infty)$. Moreover

$\|F\|_{H^2(\mathbb{R}^2_+)} = \|\hat{F}_0\|_{L^2(0,\infty)}.$

This shows incidentally that $H^2(\mathbb{R}^2_+)$ is a Hilbert space that is isomorphic
to $L^2(0, \infty)$ via the correspondence (6).

The crucial point in the proof of the theorem is the following fact. For
any fixed strictly positive $y$, we let $\hat{F}_y(\xi)$ denote the Fourier transform
of the $L^2$ function $F(x + iy)$, $x \in \mathbb{R}$. Then for any pair of choices of $y$,
$y_1$ and $y_2$, we have that

(7)    $\hat{F}_{y_1}(\xi) e^{2\pi y_1 \xi} = \hat{F}_{y_2}(\xi) e^{2\pi y_2 \xi}$ for a.e. $\xi$.

To establish this assertion we rely on a useful technical observation.

Lemma 2.2 If $F$ belongs to $H^2(\mathbb{R}^2_+)$, then $F$ is bounded in any proper
half-plane $\{z = x + iy : y \ge \delta\}$, where $\delta > 0$.

To prove this we exploit the mean-value property of holomorphic functions.
This property may be stated in two alternative ways. First, in
terms of averages over circles,

(8)    $F(\zeta) = \frac{1}{2\pi} \int_0^{2\pi} F(\zeta + re^{i\theta})\,d\theta$ if $0 < r \le \delta$.

(Note that if $\zeta$ lies in the upper half-plane, $\mathrm{Im}(\zeta) > \delta$, then the disc
centered at $\zeta$ of radius $r$ belongs to $\mathbb{R}^2_+$.) Alternatively, integrating over
$r$, we have the mean-value property in terms of discs,

(9)    $F(\zeta) = \frac{1}{\pi\delta^2} \int_{|z|<\delta} F(\zeta + z)\,dx\,dy, \quad z = x + iy.$

These assertions actually hold for harmonic functions in $\mathbb{R}^2$ (see Corollary 7.2,
Chapter 3 in Book II for the result about holomorphic functions,
and Lemma 2.8, Chapter 5 in Book I for the case of harmonic functions);
later in this chapter we in fact prove the extension of (9) to $\mathbb{R}^d$.

From (9) we see from the Cauchy-Schwarz inequality that

$|F(\zeta)|^2 \le \frac{1}{\pi\delta^2} \int_{|z|<\delta} |F(\zeta + z)|^2\,dx\,dy.$

Writing $z = x + iy$ and $\zeta = \xi + i\eta$, with $\eta > \delta$, we see that the disc
$B_\delta(\zeta)$ of center $\zeta$ and radius $\delta$ is contained in the strip
$\{z + i\eta : z = x + iy,\ -\delta < y < \delta\}$, and moreover this strip lies in the
half-plane $\mathbb{R}^2_+$. See Figure 1.

This gives the following majorization:

$\int_{|z|<\delta} |F(\zeta + z)|^2\,dx\,dy \le \int_{|y|<\delta} \int_{\mathbb{R}} |F(\zeta + x + iy)|^2\,dx\,dy \le 2\delta \sup_{-\delta<y<\delta} \int_{\mathbb{R}} |F(x + i(\eta + y))|^2\,dx.$

Recalling that $\eta > \delta$, we see that the last expression is in fact majorized
by

$2\delta \sup_{y>0} \int_{\mathbb{R}} |F(x + iy)|^2\,dx = 2\delta \|F\|^2_{H^2(\mathbb{R}^2_+)}.$

In all, $|F(\zeta)|^2 \le \frac{2}{\pi\delta} \|F\|^2_{H^2}$ in the half-plane $\mathrm{Im}(\zeta) > \delta$, which proves the
lemma.


We now turn to the proof of the identity (7). Starting with $F$ in
$H^2(\mathbb{R}^2_+)$, we improve it by replacing it with the function $F^\epsilon$ defined by

$F^\epsilon(z) = F(z)\,\frac{1}{(1 - i\epsilon z)^2},$ with $\epsilon > 0$.

Observe that $|F^\epsilon(z)| \le |F(z)|$ when $\mathrm{Im}(z) > 0$; also $F^\epsilon(z) \to F(z)$ for
each such $z$, as $\epsilon \to 0$. This shows that for each $y > 0$, $F^\epsilon(x + iy) \to F(x + iy)$
in the $L^2$-norm. Moreover, the lemma guarantees that each
$F^\epsilon$ satisfies the decay estimate

$F^\epsilon(z) = O\left( \frac{1}{1 + x^2} \right)$ whenever $\mathrm{Im}(z) > \delta$, for some $\delta > 0$.

We assert first that (7) holds with $F$ replaced by $F^\epsilon$. This is a simple
consequence of contour integration applied to the function

$G(z) = F^\epsilon(z) e^{-2\pi i z \xi}.$

In fact we integrate $G(z)$ over the rectangle with vertices $-R + iy_1$, $R + iy_1$,
$R + iy_2$, $-R + iy_2$, and let $R \to \infty$. If we take into account that
$G(z) = O(1/(1 + x^2))$ in this rectangle, then we find that

$\int_{L_1} G(z)\,dz = \int_{L_2} G(z)\,dz,$

where $L_j$ is the line $\{x + iy_j : x \in \mathbb{R}\}$, $j = 1, 2$. Since

$\int_{L_j} G(z)\,dz = \int_{\mathbb{R}} F^\epsilon(x + iy_j) e^{-2\pi i (x + iy_j)\xi}\,dx,$

this means that

$\hat{F}^\epsilon_{y_1}(\xi) e^{2\pi y_1 \xi} = \hat{F}^\epsilon_{y_2}(\xi) e^{2\pi y_2 \xi}.$

Since $F^\epsilon(x + iy_j) \to F(x + iy_j)$ in the $L^2$-norm as $\epsilon \to 0$, we then obtain (7).

The identity we have just proved states that $\hat{F}_y(\xi) e^{2\pi y \xi}$ is independent
of $y$, $y > 0$, and thus there is a function $\hat{F}_0(\xi)$ so that $\hat{F}_y(\xi) e^{2\pi \xi y} = \hat{F}_0(\xi)$;
as a result

$\hat{F}_y(\xi) = \hat{F}_0(\xi) e^{-2\pi \xi y}$ for all $y > 0$.

Therefore by Plancherel's identity

$\int_{\mathbb{R}} |F(x + iy)|^2\,dx = \int_{\mathbb{R}} |\hat{F}_0(\xi)|^2 e^{-4\pi \xi y}\,d\xi,$

and hence

$\sup_{y>0} \int_{\mathbb{R}} |\hat{F}_0(\xi)|^2 e^{-4\pi \xi y}\,d\xi = \|F\|^2_{H^2(\mathbb{R}^2_+)} < \infty.$

Finally this in turn implies that $\hat{F}_0(\xi) = 0$ for almost every $\xi \in (-\infty, 0)$.
For if this were not the case, then for appropriate positive numbers $a$, $b$,
and $c$ we could have that $|\hat{F}_0(\xi)| \ge a$ for $\xi$ in a set $E$ in $(-\infty, -b)$, with
$m(E) \ge c$. This would give $\int |\hat{F}_0(\xi)|^2 e^{-4\pi \xi y}\,d\xi \ge a^2 c\, e^{4\pi b y}$, which grows
indefinitely as $y \to \infty$. The contradiction thus obtained shows that $\hat{F}_0(\xi)$
vanishes almost everywhere when $\xi \in (-\infty, 0)$.

To summarize, for each $y > 0$ the function $\hat{F}_y(\xi)$ equals $\hat{F}_0(\xi) e^{-2\pi \xi y}$,
with $\hat{F}_0 \in L^2(0, \infty)$. The Fourier inversion formula then yields the
representation (6) for an arbitrary element of $H^2$, and the proof of the theorem
is concluded.

The second result we deal with may be viewed as the half-plane analogue
of Fatou's theorem in the previous chapter.

Theorem 2.3 Suppose $F$ belongs to $H^2(\mathbb{R}^2_+)$. Then $\lim_{y \to 0} F(x + iy) = F_0(x)$
exists in the following two senses:

(i) As a limit in the $L^2(\mathbb{R})$-norm.

(ii) As a limit for almost every $x$.

Thus $F$ has boundary values (denoted by $F_0$) in either of the two senses
above. The function $F_0$ is sometimes referred to as the boundary-value
function of $F$. The proof of (i) is immediate from what we already know.
Indeed, if $F_0$ is the $L^2$ function whose Fourier transform is $\hat{F}_0$, then

$\|F(x + iy) - F_0(x)\|^2_{L^2(\mathbb{R})} = \int_0^\infty |\hat{F}_0(\xi)|^2 |e^{-2\pi \xi y} - 1|^2\,d\xi,$

and this tends to zero as $y \to 0$ by the dominated convergence theorem.

To prove the almost everywhere convergence, we establish the Poisson
integral representation

(10)    $\int_{\mathbb{R}} \hat{f}(\xi) e^{-2\pi |\xi| y} e^{2\pi i x \xi}\,d\xi = \int_{\mathbb{R}} f(x - t) \mathcal{P}_y(t)\,dt,$

with

$\mathcal{P}_y(x) = \frac{1}{\pi} \frac{y}{y^2 + x^2}$

the Poisson kernel.$^3$ This identity holds for every $(x, y) \in \mathbb{R}^2_+$ and any
function $f$ in $L^2(\mathbb{R})$. To see this, we begin by noting the following
elementary integration formulas:

(11)    $\int_0^\infty e^{2\pi i \xi z}\,d\xi = \frac{i}{2\pi z}$ if $\mathrm{Im}(z) > 0$,



3 This is the analogue in R of the identity (3) for the circle, given in Chapter 4. 
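and

(12)    $\int_{\mathbb{R}} e^{-2\pi |\xi| y} e^{2\pi i \xi x}\,d\xi = \frac{1}{\pi} \frac{y}{y^2 + x^2}$ if $y > 0$.

The first is an immediate consequence of the fact that

$\int_0^N e^{2\pi i \xi z}\,d\xi = \frac{1}{2\pi i z} \left[ e^{2\pi i N z} - 1 \right]$

if we let $N \to \infty$. To prove the second formula, we write the integral as

$\int_0^\infty e^{-2\pi \xi y} e^{2\pi i \xi x}\,d\xi + \int_0^\infty e^{-2\pi \xi y} e^{-2\pi i \xi x}\,d\xi,$

which equals

$\frac{i}{2\pi} \left[ \frac{1}{x + iy} + \frac{1}{-x + iy} \right] = \frac{1}{\pi} \frac{y}{y^2 + x^2}$

by (11).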
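Formula (12) is easy to test numerically. The following sketch is an illustration only: it approximates the left-hand side of (12) with a truncated midpoint rule and compares it with the Poisson kernel. The function names, the cutoff and the number of steps are assumptions introduced here.

```python
import math


def poisson_kernel(x: float, y: float) -> float:
    """Closed form on the right-hand side of (12), valid for y > 0."""
    return y / (math.pi * (x * x + y * y))


def fourier_side(x: float, y: float, cutoff: float = 20.0, steps: int = 200_000) -> float:
    """Midpoint-rule approximation of the left-hand side of (12).

    The imaginary part of the integrand is odd in xi and integrates to zero,
    while the real part is even, so we integrate
    2 * exp(-2*pi*xi*y) * cos(2*pi*xi*x) over 0 <= xi <= cutoff.
    """
    h = cutoff / steps
    total = 0.0
    for k in range(steps):
        xi = (k + 0.5) * h
        total += math.exp(-2 * math.pi * xi * y) * math.cos(2 * math.pi * xi * x)
    return 2 * h * total
```

The truncation is harmless for moderate $y$ because the integrand decays like $e^{-2\pi \xi y}$; for very small $y$ a larger `cutoff` would be needed.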

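Next we establish (10) when $f$ belongs to (say) the space $\mathcal{S}$. Indeed, for
fixed $(x, y) \in \mathbb{R}^2_+$ consider the function
$\Phi(t, \xi) = f(t) e^{-2\pi i t \xi} e^{-2\pi |\xi| y} e^{2\pi i x \xi}$
on $\mathbb{R}^2 = \{(t, \xi)\}$. Since $|\Phi(t, \xi)| = |f(t)| e^{-2\pi |\xi| y}$, then (because $f$ is rapidly
decreasing) $\Phi$ is integrable over $\mathbb{R}^2$. Applying Fubini's theorem yields

$\int_{\mathbb{R}} \left( \int_{\mathbb{R}} \Phi(t, \xi)\,d\xi \right) dt = \int_{\mathbb{R}} \left( \int_{\mathbb{R}} \Phi(t, \xi)\,dt \right) d\xi.$

The right-hand side obviously gives $\int_{\mathbb{R}} \hat{f}(\xi) e^{-2\pi |\xi| y} e^{2\pi i x \xi}\,d\xi$, while the
left-hand side yields $\int_{\mathbb{R}} f(t) \mathcal{P}_y(x - t)\,dt$ in view of (12) above. However,
if we use the relation (6) in Chapter 2 we see that

$\int_{\mathbb{R}} f(t) \mathcal{P}_y(x - t)\,dt = \int_{\mathbb{R}} f(x - t) \mathcal{P}_y(t)\,dt.$

Thus the Poisson integral representation (10) holds for every $f \in \mathcal{S}$. For
a general $f \in L^2(\mathbb{R})$ we consider a sequence $\{f_n\}$ of elements in $\mathcal{S}$, so
that $f_n \to f$ (and also $\hat{f}_n \to \hat{f}$) in the $L^2$-norm. A passage to the limit
then yields the formula for $f$ from the corresponding formula for each
$f_n$. Indeed, by the Cauchy-Schwarz inequality we have

$\left| \int_{\mathbb{R}} [\hat{f}(\xi) - \hat{f}_n(\xi)] e^{-2\pi |\xi| y} e^{2\pi i x \xi}\,d\xi \right| \le \|\hat{f} - \hat{f}_n\|_{L^2} \left( \int_{\mathbb{R}} e^{-4\pi |\xi| y}\,d\xi \right)^{1/2},$

and also

$\left| \int_{\mathbb{R}} [f(x - t) - f_n(x - t)] \mathcal{P}_y(t)\,dt \right| \le \|f - f_n\|_{L^2} \left( \int_{\mathbb{R}} |\mathcal{P}_y(t)|^2\,dt \right)^{1/2},$

and the right-hand sides tend to 0 because for each fixed $(x, y) \in \mathbb{R}^2_+$ the
functions $e^{-2\pi |\xi| y}$, $\xi \in \mathbb{R}$, and $\mathcal{P}_y(t)$, $t \in \mathbb{R}$, belong to $L^2(\mathbb{R})$.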

Having established the Poisson integral representation (10), we return
to our given element $F \in H^2(\mathbb{R}^2_+)$. We know that there is an $L^2$ function
$\hat{F}_0(\xi)$ (which vanishes when $\xi < 0$) such that (6) holds. With $F_0$ the
$L^2(\mathbb{R})$ function whose Fourier transform is $\hat{F}_0(\xi)$, we see from (10), with
$f = F_0$, that

$F(x + iy) = \int_{\mathbb{R}} F_0(x - t) \mathcal{P}_y(t)\,dt.$

From this we deduce the fact that $F(x + iy) \to F_0(x)$ a.e. in $x$ as $y \to 0$,
since the family $\{\mathcal{P}_y\}$ is an approximation of the identity for which
Theorem 2.1 in Chapter 3 applies. There is, however, one small obstacle that
has to be overcome: the theorem as stated applies to $L^1$ functions and
not to functions in $L^2$. Nevertheless, given the nature of the approximation
to the identity, a simple "localization" argument will succeed. We
proceed as follows.

It will suffice to see that for any large $N$, which is fixed, $F(x + iy) \to F_0(x)$
for a.e. $x$ with $|x| < N$. To do this, decompose $F_0$ as $G + H$, where
$G(x) = F_0(x)$ when $|x| \le 2N$, $G(x) = 0$ when $|x| > 2N$; thus $H(x) = 0$
if $|x| \le 2N$ but $|H(x)| \le |F_0(x)|$. Note that now $G \in L^1$ and

$\int_{\mathbb{R}} F_0(x - t) \mathcal{P}_y(t)\,dt = \int_{\mathbb{R}} G(x - t) \mathcal{P}_y(t)\,dt + \int_{\mathbb{R}} H(x - t) \mathcal{P}_y(t)\,dt.$

Therefore, by the above mentioned theorem in Chapter 3, the first integral
on the right-hand side converges for a.e. $x$ to $G(x) = F_0(x)$ when
$|x| < N$. Moreover, when $|x| < N$ the integrand of the second integral vanishes
when $|t| < N$ (since then $|x - t| < 2N$). That integral is therefore
majorized by

$\left( \int_{\mathbb{R}} |H(x - t)|^2\,dt \right)^{1/2} \left( \int_{|t| \ge N} |\mathcal{P}_y(t)|^2\,dt \right)^{1/2}.$

However $\left( \int_{\mathbb{R}} |H(x - t)|^2\,dt \right)^{1/2} \le \|F_0\|_{L^2}$, while (as is easily seen)
$\int_{|t| \ge N} |\mathcal{P}_y(t)|^2\,dt \to 0$ as $y \to 0$. Hence $F(x + iy) \to F_0(x)$ for a.e. $x$ with
$|x| < N$, as $y \to 0$, and since $N$ is arbitrary, the proof of Theorem 2.3 is
now complete.

The following comments may help clarify the thrust of the above theorems.

(i) Let $S$ be the subspace of $L^2(\mathbb{R})$ consisting of all functions $F_0$ arising in
Theorem 2.3. Then, since the functions $F_0$ are exactly those functions in
$L^2$ whose Fourier transform is supported on the half-line $(0, \infty)$, we see
that $S$ is a closed subspace. We might be tempted to say that $S$ consists
of those functions in $L^2$ that arise as boundary values of holomorphic
functions in the upper half-plane; but this heuristic assertion is not exact
if we do not add a quantitative restriction such as in the definition (5)
of the Hardy space. See Exercise 4.

(ii) Suppose we defined $P$ to be the orthogonal projection on the subspace
$S$ of $L^2$. Then, as is easily seen, $\widehat{(Pf)}(\xi) = \chi(\xi) \hat{f}(\xi)$ for any $f \in L^2(\mathbb{R})$;
here $\chi$ is the characteristic function of $(0, \infty)$. The operator $P$ is also
closely related to the Cauchy integral. Indeed, if $F$ is the (unique)
element in $H^2(\mathbb{R}^2_+)$ whose boundary function (according to Theorem 2.3)
is $P(f)$, then

$F(z) = \frac{1}{2\pi i} \int_{\mathbb{R}} \frac{f(t)}{t - z}\,dt, \quad z \in \mathbb{R}^2_+.$

To prove this it suffices to verify that for any $f \in L^2(\mathbb{R})$ and any fixed
$z = x + iy \in \mathbb{R}^2_+$, we have

$\int_0^\infty \hat{f}(\xi) e^{2\pi i \xi z}\,d\xi = \frac{1}{2\pi i} \int_{\mathbb{R}} \frac{f(t)}{t - z}\,dt.$

This is proved in the same way as the Poisson integral representation (10)
except here we use the identity (11) instead of (12). The details may be
left to the interested reader. Also, the reader might note the close analogy
between this version of the Cauchy integral for the upper half-plane, and
a corresponding version for the unit disc, as given in Example 2, Section 4
of Chapter 4.

(iii) In analogy with the periodic case discussed in Exercise 30 of Chapter 4,
we define a Fourier multiplier operator $T$ on $\mathbb{R}$ to be a linear
operator on $L^2(\mathbb{R})$ determined by a bounded function $m$ (the multiplier),
such that $T$ is defined by the formula $\widehat{(Tf)}(\xi) = m(\xi) \hat{f}(\xi)$ for
any $f \in L^2(\mathbb{R})$. The orthogonal projection $P$ above is such an operator
and its multiplier is the characteristic function $\chi(\xi)$. Another closely
related operator of this type is the Hilbert transform $H$ defined by
$P = \frac{I + iH}{2}$. Then $H$ is a Fourier multiplier operator corresponding to the
multiplier $\frac{1}{i}\,\mathrm{sign}(\xi)$. Among the many important properties of $H$ is its
connection to conjugate harmonic functions. Indeed, for $f$ a real-valued
function in $L^2(\mathbb{R})$, $f$ and $H(f)$ are, respectively, the real and imaginary
parts of the boundary values of a function in the Hardy space. More
about the Hilbert transform can be found in Exercises 9 and 10 and
Problem 5 below.
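The relation between the two multipliers in remark (iii) is a pointwise algebraic identity, which the short sketch below checks at a few sample frequencies. It is only an illustration: the helper names are hypothetical, and the point $\xi = 0$ is omitted because it has measure zero and the value of a multiplier there is immaterial.

```python
def sign(xi: float) -> int:
    """Return 1, 0 or -1 according to the sign of xi."""
    return (xi > 0) - (xi < 0)


def projection_multiplier(xi: float) -> float:
    """Multiplier of P: the characteristic function of (0, infinity)."""
    return 1.0 if xi > 0 else 0.0


def hilbert_multiplier(xi: float) -> complex:
    """Multiplier of H, namely sign(xi) / i."""
    return sign(xi) / 1j


def projection_from_hilbert(xi: float) -> complex:
    """Multiplier of (I + iH) / 2, which should agree with that of P."""
    return (1 + 1j * hilbert_multiplier(xi)) / 2
```

Squaring the multiplier of $H$ gives $-1$ for every $\xi \ne 0$, which is the multiplier form of the identity $H(Hf) = -f$ on $L^2(\mathbb{R})$.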


3 Constant coefficient partial differential equations 

We turn our attention to solving the linear partial differential equation

(13)    $L(u) = f,$

where the operator $L$ takes the form

$L = \sum_{|\alpha| \le n} a_\alpha \left( \frac{\partial}{\partial x} \right)^\alpha$

with $a_\alpha \in \mathbb{C}$ constants.

In the study of the classical examples of $L$, such as the wave equation,
the heat equation, and Laplace's equation, one already sees the Fourier
transform entering in an important way. For general $L$, this key role
is further indicated by the following simple observation. If, for example,
we try to solve this equation with both $u$ and $f$ elements in $\mathcal{S}$, then this
is equivalent to the algebraic equation

$P(\xi) \hat{u}(\xi) = \hat{f}(\xi),$

where $P(\xi)$ is the characteristic polynomial of $L$ defined by

$P(\xi) = \sum_{|\alpha| \le n} a_\alpha (2\pi i \xi)^\alpha.$

This is because one has the Fourier transform identity

$\widehat{\left( \frac{\partial^\alpha f}{\partial x^\alpha} \right)}(\xi) = (2\pi i \xi)^\alpha \hat{f}(\xi).$

Thus a solution $u$ in the space $\mathcal{S}$ (if it exists) would be uniquely determined
by

$\hat{u}(\xi) = \frac{\hat{f}(\xi)}{P(\xi)}.$

4 See for example Chapters 5 and 6 in Book I. 
In a more general setting, matters are not so easy: aside from the question
of defining (13), the Fourier transform is not directly applicable;
also, solutions that we prove to exist (but are not unique!) have to be
understood in a wider sense.

3.1 Weak solutions 

As the reader may have guessed, it will not suffice to restrict our attention
to those functions for which $L(u)$ is defined in the usual way, but instead
a broader notion is needed, one involving the idea of "weak solutions."
To describe this concept, we start with a given open set $\Omega$ in $\mathbb{R}^d$ and
consider the space $C_0^\infty(\Omega)$, which consists of the indefinitely differentiable
functions$^5$ having compact support in $\Omega$.$^6$ We have the following fact.

Lemma 3.1 The space $C_0^\infty(\Omega)$ is dense in $L^2(\Omega)$ in the norm $\|\cdot\|_{L^2(\Omega)}$.

The proof is essentially a repetition of that of Lemma 1.2. We take the
precaution of modifying the definition of $g_M$ given there to be: $g_M(x) = f(x)$
if $|x| \le M$, $d(x, \Omega^c) \ge 1/M$ and $|f(x)| \le M$, and $g_M(x) = 0$ otherwise.
Also, when we regularize $g_M$ we replace it with $g_M * \varphi_\delta$, with
$\delta \le 1/2M$. Then the support of $g_M * \varphi_\delta$ is still compact and at a distance
$\ge 1/2M$ from $\Omega^c$.

We next consider the adjoint operator of $L$ defined by

$L^* = \sum_{|\alpha| \le n} (-1)^{|\alpha|}\, \overline{a_\alpha} \left( \frac{\partial}{\partial x} \right)^\alpha.$

The operator $L^*$ is called the adjoint of $L$ because, in analogy with
the definition of the adjoint of a bounded linear transformation given in
Section 5.2 of the previous chapter, we have

(14)    $(L\varphi, \psi) = (\varphi, L^*\psi)$ whenever $\varphi, \psi \in C_0^\infty(\Omega)$,

where $(\cdot, \cdot)$ denotes the inner product on $L^2(\Omega)$ (which is the restriction
of the usual inner product on $L^2(\mathbb{R}^d)$). The identity (14) is proved by
successive integration by parts. Indeed, consider first the special case
when $L = \partial/\partial x_j$, and then $L^* = -\partial/\partial x_j$. If we use Fubini's theorem,
integrating first in the $x_j$ variable, then in this case (14) reduces to the

5 Indefinitely differentiable functions are also referred to as C°° functions, or smooth 
functions. 

6 This means that the closure of the support of $f$, as defined in Section 1 of Chapter 2,
is compact and contained in $\Omega$.
familiar one-dimensional formula

$\int_{-\infty}^{\infty} \frac{d\varphi}{dx}\, \overline{\psi}\,dx = -\int_{-\infty}^{\infty} \varphi\, \frac{d\overline{\psi}}{dx}\,dx,$

with the integrated boundary terms vanishing because of the assumed
support properties of $\psi$ (or $\varphi$). Once established for $L = \partial/\partial x_j$, $1 \le j \le n$,
then (14) follows for $L = (\partial/\partial x)^\alpha$ by iteration, and hence for general
$L$ by linearity.

At this point we digress momentarily to consider besides $C_0^\infty(\Omega)$ some
other spaces of differentiable functions on $\Omega$ that will be useful later.
The space $C^n(\Omega)$ consists of all functions $f$ on $\Omega$ that have continuous
partial derivatives of order $\le n$. Also, the space $C^n(\overline{\Omega})$ consists of those
functions on $\overline{\Omega}$ that can be extended to functions in $\mathbb{R}^d$ that belong to
$C^n(\mathbb{R}^d)$. Thus, in an obvious sense, we have the inclusion relation

$C_0^\infty(\Omega) \subset C^n(\overline{\Omega}) \subset C^n(\Omega)$, for each positive integer $n$.

Returning to our partial differential operator $L$, it is useful to observe
that the formula

$(Lu, \psi) = (u, L^*\psi)$

continues to hold (with the same proof) if we merely assume that
$u \in C^n(\overline{\Omega})$ without assuming it has compact support, while still supposing
$\psi \in C_0^\infty(\Omega)$.

In particular, if we have $L(u) = f$ in the ordinary sense (sometimes
called the "strong" sense), which requires the assumption that $u \in C^n(\Omega)$
in order to define the partial derivatives entering in $Lu$, then we would
also have

(15)    $(f, \psi) = (u, L^*\psi)$ for all $\psi \in C_0^\infty(\Omega)$.

This leads to the following important definition: if $f \in L^2(\Omega)$, a function
$u \in L^2(\Omega)$ is a weak solution of the equation $Lu = f$ in $\Omega$ if (15) holds.
Of course an ordinary solution is always a weak solution.

Significant instances of weak solutions that are not ordinary solutions
already arise in elementary situations such as in the study of the
one-dimensional wave equation. Here $L(u) = (\partial^2 u/\partial x^2) - (\partial^2 u/\partial t^2)$, so the
underlying space is $\mathbb{R}^2 = \{(x_1, x_2) :$ with $x_1 = x,\ x_2 = t\}$. Suppose, for
example, we consider the case of the "plucked string."$^7$ We are then


7 See Chapter 1 in Book I. 
looking at the solution of $L(u) = 0$ subject to the boundary conditions
$u(x, 0) = f(x)$ and $(\partial u/\partial t)(x, 0) = 0$ for $0 \le x \le \pi$, where the graph of
$f$ is piecewise linear and is illustrated in Figure 2.



Figure 2. Initial position of a plucked string 


If one extends $f$ to $[-\pi, \pi]$ by making it odd, and then to all of $\mathbb{R}$
by periodicity (of period $2\pi$), then the solution is given by d'Alembert's
formula

$u(x, t) = \frac{f(x + t) + f(x - t)}{2}.$

In the present case $u$ is not twice continuously differentiable, and it is
therefore not an ordinary solution. Nevertheless it is a weak solution.
To see this, approximate $f$ by a sequence of functions $f_n$ that are $C^\infty$
and such that $f_n \to f$ uniformly on every compact subset of $\mathbb{R}$.$^8$ If we
define $u_n(x, t)$ as $[f_n(x + t) + f_n(x - t)]/2$, we can check directly that
$L(u_n) = 0$ and hence $(u_n, L^*\psi) = 0$ for all $\psi \in C_0^\infty(\mathbb{R}^2)$, and thus by
uniform convergence we obtain that $(u, L^*\psi) = 0$ as desired.
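A small numerical illustration may help here. It does not verify the weak-solution condition (15) itself; instead it checks an elementary identity that any function of d'Alembert form satisfies exactly, even when $f$ has corners: the discrete wave operator $u(x+h,t) + u(x-h,t) - u(x,t+h) - u(x,t-h)$ vanishes, because the two sums involve the same four values of $f$. The particular shape of the plucked string (peak position `p` and `height`) and all function names are assumptions made for this sketch, not data taken from Figure 2.

```python
import math


def plucked(x: float, p: float = 1.0, height: float = 1.0) -> float:
    """Odd, 2*pi-periodic extension of a piecewise linear profile on [0, pi].

    On [0, pi] the profile rises linearly to `height` at x = p and then
    falls linearly to 0 at x = pi (hypothetical shape).
    """
    x = math.fmod(x, 2 * math.pi)      # now |x| < 2*pi
    if x > math.pi:
        x -= 2 * math.pi
    elif x < -math.pi:
        x += 2 * math.pi
    if x < 0:
        return -plucked(-x, p, height)  # odd extension
    if x <= p:
        return height * x / p
    return height * (math.pi - x) / (math.pi - p)


def dalembert(x: float, t: float) -> float:
    """u(x, t) = [f(x + t) + f(x - t)] / 2 with f the plucked profile."""
    return 0.5 * (plucked(x + t) + plucked(x - t))


def discrete_wave_defect(x: float, t: float, h: float) -> float:
    """u(x+h, t) + u(x-h, t) - u(x, t+h) - u(x, t-h); zero for d'Alembert solutions."""
    return (dalembert(x + h, t) + dalembert(x - h, t)
            - dalembert(x, t + h) - dalembert(x, t - h))
```

The defect vanishes up to rounding at every point, including points where the corner of $f$ passes through, which is where the classical second derivatives fail to exist.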

A different example illustrating the nature of weak solutions arises for
the operator $L = d/dx$ on $\mathbb{R}$. If we suppose $\Omega = (0, 1)$, then with $u$ and
$f$ in $L^2(\Omega)$, we have that $Lu = f$ in the weak sense if and only if there is
an absolutely continuous function $F$ on $[0, 1]$ such that $F(x) = u(x)$ and
$F'(x) = f(x)$ almost everywhere. For more about this, see Exercise 14.


3.2 The main theorem and key estimate 

We now turn to the general theorem guaranteeing the existence of solutions
of partial differential equations with constant coefficients.

Theorem 3.2 Suppose $\Omega$ is a bounded open subset of $\mathbb{R}^d$. Given a linear
partial differential operator $L$ with constant coefficients, there exists a

8 One may write, for example, $f_n = f * \varphi_{1/n}$, where $\{\varphi_\epsilon\}$ is the approximation to the
identity, as in the proof of Lemma 1.2.
bounded linear operator $K$ on $L^2(\Omega)$ such that whenever $f \in L^2(\Omega)$, then
$L(Kf) = f$ in the weak sense.

In other words, $u = K(f)$ is a weak solution to $L(u) = f$.

The heart of the matter lies in an inequality that we state next, but 
whose proof (which uses the Fourier transform) is postponed until the 
next section. 

Lemma 3.3 There exists a constant $c$ such that

$\|\psi\|_{L^2(\Omega)} \le c \|L^*\psi\|_{L^2(\Omega)}$ whenever $\psi \in C_0^\infty(\Omega)$.

The usefulness of this lemma comes about for the following reason. 
If L is a finite-dimensional linear transformation, the solvability of L 
(the fact that it is surjective) is of course equivalent with the fact that 
its adjoint L* is injective. In effect, the lemma provides the analytic 
substitute for this reasoning in an infinite-dimensional setting. 

We first prove the theorem assuming the validity of the inequality in 
the lemma. 

Consider the pre-Hilbert space $\mathcal{H}_0 = C_0^\infty(\Omega)$ equipped with the inner
product and norm

$\langle \varphi, \psi \rangle_0 = (L^*\varphi, L^*\psi), \quad \|\psi\|_0 = \|L^*\psi\|_{L^2(\Omega)}.$

Following the results in Section 2.3 of Chapter 4, we let $\mathcal{H}$ denote the
completion of $\mathcal{H}_0$. By Lemma 3.3, a Cauchy sequence in the $\|\cdot\|_0$-norm
is also Cauchy in the $L^2(\Omega)$-norm; hence we may identify $\mathcal{H}$ with a
subspace of $L^2(\Omega)$. Also, $L^*$, initially defined as a bounded operator
from $\mathcal{H}_0$ to $L^2(\Omega)$, extends to a bounded operator $L^*$ from $\mathcal{H}$ to $L^2(\Omega)$
(by Lemma 1.3). For a fixed $f \in L^2(\Omega)$, consider the linear map
$\ell_0 : C_0^\infty(\Omega) \to \mathbb{C}$ defined by

$\ell_0(\psi) = (\psi, f)$ for $\psi \in C_0^\infty(\Omega)$.

The Cauchy-Schwarz inequality together with another application of
Lemma 3.3 yields

$|\ell_0(\psi)| = |(\psi, f)| \le \|\psi\|_{L^2(\Omega)} \|f\|_{L^2(\Omega)} \le c \|L^*\psi\|_{L^2(\Omega)} \|f\|_{L^2(\Omega)} \le c' \|\psi\|_0,$

with $c' = c\|f\|_{L^2(\Omega)}$. Hence $\ell_0$ is bounded on the pre-Hilbert space $\mathcal{H}_0$.
Therefore, $\ell_0$ extends to a bounded linear functional $\ell$ on $\mathcal{H}$ (see Section 5.1,
Chapter 4), and the above inequalities show that $\|\ell\| \le c\|f\|_{L^2(\Omega)}$. By
the Riesz representation theorem applied to $\ell$ on the Hilbert space $\mathcal{H}$
(Theorem 5.3 in Chapter 4), there exists $U \in \mathcal{H}$ such that

$\ell(\psi) = \langle \psi, U \rangle_0 = (L^*\psi, L^*U)$ for all $\psi \in C_0^\infty(\Omega)$.

Here $\langle \cdot, \cdot \rangle_0$ denotes the extension to $\mathcal{H}$ of the initial inner product on $\mathcal{H}_0$,
and $L^*$ also denotes the extension of $L^*$ originally given on $\mathcal{H}_0$.

If we let $u = L^*U$, then $u \in L^2(\Omega)$, and we find that

$\ell(\psi) = (\psi, f) = (L^*\psi, u)$ for all $\psi \in C_0^\infty(\Omega)$.

Hence

$(f, \psi) = (u, L^*\psi)$ for all $\psi \in C_0^\infty(\Omega)$,

and by definition, $u$ is a weak solution to the equation $Lu = f$ in $\Omega$. If
we let $Kf = u$, we see that once $f$ is given, $Kf$ is uniquely determined
by the above steps. Since $\|U\|_0 = \|\ell\| \le c\|f\|_{L^2(\Omega)}$ we see that

$\|Kf\|_{L^2(\Omega)} = \|u\|_{L^2(\Omega)} = \|L^*U\|_{L^2(\Omega)} = \|U\|_0 \le c\|f\|_{L^2(\Omega)},$

whence $K : L^2(\Omega) \to L^2(\Omega)$ is bounded.

Proof of the main estimate 

To complete the proof of the theorem, we must still prove the estimate
in Lemma 3.3, that is,

$\|\psi\|_{L^2(\Omega)} \le c \|L^*\psi\|_{L^2(\Omega)}$ whenever $\psi \in C_0^\infty(\Omega)$.

The reasoning below relies on an important fact: if $f$ has compact
support in $\mathbb{R}$, then $\hat{f}(\xi)$ initially defined for $\xi \in \mathbb{R}$ extends to an entire
function for $\zeta = \xi + i\eta \in \mathbb{C}$. This observation reduces the problem to an
inequality about holomorphic functions and polynomials.

Lemma 3.4 Suppose $P(z) = z^m + \cdots + a_1 z + a_0$ is a polynomial of
degree $m$ with leading coefficient 1. If $F$ is a holomorphic function on $\mathbb{C}$,
then

$|F(0)|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |P(e^{i\theta}) F(e^{i\theta})|^2\,d\theta.$

^ JO 
Proof. The lemma is a consequence of the special case when $P = 1$

(16)    $|F(0)|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |F(e^{i\theta})|^2\,d\theta.$

This assertion follows directly from the mean-value identity (8) in Section 2
with $\zeta = 0$ and $r = 1$, via the Cauchy-Schwarz inequality. With it
we begin by factoring $P$:

$P(z) = \prod_{|\alpha| \ge 1} (z - \alpha) \prod_{|\beta| < 1} (z - \beta) = P_1(z) P_2(z),$

where each product is finite and taken over the roots of $P$ whose absolute
values are $\ge 1$ and $< 1$, respectively.

Note that $|P_1(0)| = \prod_{|\alpha| \ge 1} |\alpha| \ge 1$.
For $P_2$ we write

$(z - \beta) = -(1 - \overline{\beta} z)\, \psi_\beta(z),$

where $\psi_\beta(z) = \frac{\beta - z}{1 - \overline{\beta} z}$ are the "Blaschke factors" that have the obvious
property that they are holomorphic in a region containing the closed
unit disc and $|\psi_\beta(e^{i\theta})| = 1$; see also Chapter 8 in Book II. We write
$\tilde{P}_2 = \prod_{|\beta| < 1} (1 - \overline{\beta} z)$ and $\tilde{P} = P_1 \tilde{P}_2$. Thus $|\tilde{P}(0)| \ge 1$, while
$|\tilde{P}(e^{i\theta})| = |P(e^{i\theta})|$ for every $\theta$. We now apply (16) to the function $\tilde{P}F$ in place of
$F$ and find that

$|F(0)|^2 \le |\tilde{P}(0) F(0)|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |\tilde{P}(e^{i\theta}) F(e^{i\theta})|^2\,d\theta = \frac{1}{2\pi} \int_0^{2\pi} |P(e^{i\theta}) F(e^{i\theta})|^2\,d\theta,$

which gives the desired conclusion.
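The steps of this proof can be traced numerically for a particular choice of $P$ and $F$. The sketch below is an illustration under assumptions: the roots and the entire function $F(z) = e^z$ used in the usage note are arbitrary choices, the helper names are hypothetical, and the circle average is approximated by an equally spaced Riemann sum (which is very accurate for smooth periodic integrands).

```python
import cmath
import math


def monic_poly(roots, z: complex) -> complex:
    """P(z) = product of (z - r) over the given roots."""
    out = 1 + 0j
    for r in roots:
        out *= (z - r)
    return out


def reflected_poly(roots, z: complex) -> complex:
    """P-tilde(z): roots of modulus >= 1 are kept, while each factor (z - b)
    with |b| < 1 is replaced by (1 - conj(b) z), as in the proof."""
    out = 1 + 0j
    for r in roots:
        out *= (z - r) if abs(r) >= 1 else (1 - r.conjugate() * z)
    return out


def circle_mean_square(g, samples: int = 2048) -> float:
    """Approximate (1 / 2 pi) * integral over [0, 2 pi] of |g(e^{i theta})|^2."""
    total = 0.0
    for k in range(samples):
        z = cmath.exp(2j * math.pi * k / samples)
        total += abs(g(z)) ** 2
    return total / samples
```

With `roots = [2 + 0j, 0.5 + 0j, -0.3 + 0.4j]` and `F = cmath.exp`, one can compare `abs(F(0)) ** 2` with `circle_mean_square(lambda z: monic_poly(roots, z) * F(z))`, and also observe that replacing `monic_poly` by `reflected_poly` leaves the circle average unchanged, since the two polynomials have the same modulus on the unit circle.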

We turn to the proof of the inequality $\|\psi\| \le c\|L^*\psi\|$ for all $\psi \in C_0^\infty(\Omega)$
in the special case of one dimension, that is, $\Omega \subset \mathbb{R}$.

Suppose $f$ is an $L^2$ function supported on the interval $[-M, M]$. Then

$\hat{f}(\xi) = \int_{-M}^{M} f(x) e^{-2\pi i x \xi}\,dx$

whenever $\xi \in \mathbb{R}$. In fact, the above integral converges whenever $\xi$ is
replaced by $\zeta = \xi + i\eta \in \mathbb{C}$, and we may extend $\hat{f}$ to a holomorphic
function of $\zeta$ in the whole complex plane. An application of the Plancherel
formula (for fixed $\eta$) yields

$\int_{-\infty}^{\infty} |\hat{f}(\xi + i\eta)|^2\,d\xi \le e^{4\pi M |\eta|} \int_{-\infty}^{\infty} |f(x)|^2\,dx.$
J — oo J —oo 

We use this observation in the following context. We may assume (upon
multiplying $L$ by a suitable constant) that

$L^* = \sum_{0 \le k \le n} (-1)^k\, \overline{a_k} \left( \frac{d}{dx} \right)^k,$

where $a_n = (2\pi i)^{-n}$. If we let $Q(\xi) = \sum_{0 \le k \le n} (-1)^k\, \overline{a_k}\, (2\pi i \xi)^k$ be its
characteristic polynomial, then we note that

$\widehat{L^*\psi}(\xi) = Q(\xi) \hat{\psi}(\xi)$ whenever $\psi \in C_0^\infty(\mathbb{R})$.

If $M$ is chosen so large that $\Omega \subset [-M, M]$, then our previous observation
gives

(17)    $\int_{-\infty}^{\infty} |Q(\xi + i\eta) \hat{\psi}(\xi + i\eta)|^2\,d\xi \le e^{4\pi M |\eta|} \int_{\mathbb{R}} |L^*\psi(x)|^2\,dx.$

Picking $\eta = \sin\theta$, and making a translation by $\cos\theta$ yields

$\int_{-\infty}^{\infty} |Q(\xi + \cos\theta + i\sin\theta)\, \hat{\psi}(\xi + \cos\theta + i\sin\theta)|^2\,d\xi \le e^{4\pi M} \int_{\mathbb{R}} |L^*\psi(x)|^2\,dx.$

An application of Lemma 3.4 with $F(z) = \hat{\psi}(\xi + z)$ and $Q(\xi + z)$ in place
of $P(z)$ then gives

$|\hat{\psi}(\xi)|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |Q(\xi + \cos\theta + i\sin\theta)\, \hat{\psi}(\xi + \cos\theta + i\sin\theta)|^2\,d\theta.$

We now integrate in $\xi$ over $\mathbb{R}$, and on the right-hand side interchange
the order of the $\xi$ and $\theta$ integrations; also by translation invariance we
replace the integration in the $\xi$ variable by that in the variable $\xi + \cos\theta$.
Using (17) the result is

$\|\hat{\psi}\|^2_{L^2(\mathbb{R})} \le \frac{1}{2\pi} \int_0^{2\pi} \int_{\mathbb{R}} |Q(\xi + i\sin\theta)\, \hat{\psi}(\xi + i\sin\theta)|^2\,d\xi\,d\theta \le e^{4\pi M} \int_{\mathbb{R}} |L^*\psi(x)|^2\,dx,$

which by Plancherel's identity proves the main lemma in the one-dimensional
case.

The higher dimensional case is a modification of the argument above.

Let $Q(\xi) = \sum_{|\alpha| \le n} (-1)^{|\alpha|}\, \overline{a_\alpha}\, (2\pi i \xi)^\alpha$ be the characteristic polynomial of $L^*$.
Then we can choose a new set of orthogonal axes (whose coordinates we
denote by $(\xi_1, \ldots, \xi_d)$) so that if $\xi = (\xi_1, \xi')$ with $\xi' = (\xi_2, \ldots, \xi_d)$, then
after multiplying by a suitable constant

(18)    $Q(\xi) = (2\pi i)^{-n} \xi_1^n + \sum_{j=0}^{n-1} \xi_1^j\, q_j(\xi'),$

where $q_j(\xi')$ are polynomials of $\xi'$ (of degrees $\le n - j$).

To see that such a choice is possible, write $Q = Q_n + Q'$ where $Q_n$ is
homogeneous of degree $n$ and $Q'$ has degree $< n$. Then since we may
assume $Q_n \ne 0$ there is (after multiplying $Q$ by a suitable constant),
a unit vector $\gamma$ so that $Q_n(\gamma) = (2\pi i)^{-n}$. Then $Q_n(\xi) = (2\pi i)^{-n} r^n$ if
$\xi = \gamma r$, $r \in \mathbb{R}$. We can then take the $\xi_1$-axis to lie along $\gamma$, and the
$\xi_2, \ldots, \xi_d$-axes to be in mutually orthogonal directions, from which the
form (18) is clear.

Proceeding now as before we obtain

$|\hat{\psi}(\xi_1, \xi')|^2 \le \frac{1}{2\pi} \int_0^{2\pi} |Q(\xi_1 + e^{i\theta}, \xi')\, \hat{\psi}(\xi_1 + e^{i\theta}, \xi')|^2\,d\theta$

for each $(\xi_1, \xi') \in \mathbb{R}^d$. An integration$^9$ then gives

$\|\hat{\psi}\|^2_{L^2(\mathbb{R}^d)} \le \frac{1}{2\pi} \int_0^{2\pi} \int_{\mathbb{R}^d} |Q(\xi_1 + i\sin\theta, \xi')\, \hat{\psi}(\xi_1 + i\sin\theta, \xi')|^2\,d\xi\,d\theta.$

If we suppose that the projection of the (bounded) set $\Omega$ on the $x_1$-axis
is contained in $[-M, M]$, we see as before that the right-hand side above
is majorized by $e^{4\pi M} \int_{\mathbb{R}^d} |L^*\psi(x)|^2\,dx$, finishing the proof of Lemma 3.3
and hence that of the theorem.

4* The Dirichlet principle 

Dirichlet’s principle arose in the study of the boundary-value problem 
for Laplace’s equation. Stated in the case of two dimensions it refers to 
the classical problem of finding the steady-state temperature of a plate 


9 We note that by the rotational invariance of Lebesgue measure (Problem 4 in
Chapter 2 and Exercise 26 in Chapter 3), integration in $\xi$ can be carried out in the new
coordinates as well.
whose boundary is exposed to a given temperature distribution. The
issue raised is the following question, called the Dirichlet problem:
If $\Omega$ is a bounded open set in $\mathbb{R}^2$ and $f$ a continuous function on the
boundary $\partial\Omega$, we wish to find a function $u(x_1, x_2)$ such that

(19)    $\Delta u = 0$ in $\Omega$,  $u = f$ on $\partial\Omega$.

Thus we need to determine a function that is $C^2$ (twice continuously
differentiable) in $\Omega$, whose Laplacian$^{10}$ is zero, and which is continuous
on the closure of $\Omega$, with $u|_{\partial\Omega} = f$.

With either $\Omega$ or $f$ satisfying special symmetry conditions, the solution
to this problem can sometimes be written out explicitly. For instance, if
$\Omega$ is the unit disc, then

$u(re^{i\theta}) = \frac{1}{2\pi} \int_0^{2\pi} f(e^{i\varphi}) P_r(\theta - \varphi)\,d\varphi,$

where $P_r$ is the Poisson kernel (for the disc). We also obtained (in Books I
and II) explicit formulas for the solution of the Dirichlet problem for some
unbounded domains. For example, when $\Omega$ is the upper half-plane the
solution is

$u(x, y) = \int_{\mathbb{R}} \mathcal{P}_y(x - t) f(t)\,dt,$

where $\mathcal{P}_y(x)$ is the analogous Poisson kernel for the upper half-plane. A
somewhat similar convolution formula was obtained when $\Omega$ is a strip.
Also, the Dirichlet problem can be solved explicitly for certain $\Omega$ by using
conformal mappings.$^{11}$

In general, however, there are no explicit solutions, and other methods
must be found. An idea that was used initially was based on an approach
of wide utility in mathematics and physics: to find the equilibrium state
of a system one seeks to minimize an appropriate "energy" or "action."
In the present case the role of this energy is played by the Dirichlet
integral, which is defined for appropriate functions $U$ by

$\mathcal{D}(U) = \int_\Omega |\nabla U|^2 = \int_\Omega \left( \left| \frac{\partial U}{\partial x_1} \right|^2 + \left| \frac{\partial U}{\partial x_2} \right|^2 \right) dx_1\,dx_2.$

(Note the similarity with the expression of the "potential energy" in the
case of the vibrating string in Chapters 3 and 6 of Book I.) In fact,


10 The Laplacian of a function $u$ in $\mathbb{R}^d$ is defined by $\Delta u = \sum_{k=1}^{d} \partial^2 u/\partial x_k^2$.

11 The close relation between conformal maps and the Dirichlet problem is discussed in 
the last part of Section 1 of Chapter 8, in Book II. 
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that approach underlies the proof Riemann proposed for his well-known 
mapping theorem. About this early history R. Courant has written: 

Already some years before the rise of Riemann’s genius, 

C.F. Gauss and W. Thompson had observed that the bound¬ 
ary value problem of the harmonic differential equation Au = 
u xx + u yy = 0 for a domain G in the x, y-plane can be re¬ 
duced to the problem of minimizing the integral T>[(/)] for the 
domain G, under the condition that the functions (j) admitted 
to competition have the prescribed boundary values. Because 
of the positive character of T>[(j)\ the existence of a solution 
for the latter problem was considered obvious and hence the 
existence for the former assured. As a student in Dirichlet’s 
lectures, Riemann had been fascinated by this convincing ar¬ 
gument: soon afterwards he used it, under the name “Dirich- 
let’s Principle,” in a more varied and spectacular manner as 
the very foundation of his new geometric function theory. 

The application of Dirichlet’s principle was thought to have been justified
by the following simple observation:

Proposition 4.1 Suppose there exists a function $u \in C^2(\overline{\Omega})$ that minimizes
$\mathcal{D}(U)$ among all $U \in C^2(\overline{\Omega})$ with $U|_{\partial\Omega} = f$. Then $u$ is harmonic
in $\Omega$.

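Proof. For functions $F$ and $G$ in $C^2(\overline{\Omega})$ define the following inner
product:

$$\langle F, G \rangle = \int_\Omega \left( \frac{\partial F}{\partial x_1} \overline{\frac{\partial G}{\partial x_1}} + \frac{\partial F}{\partial x_2} \overline{\frac{\partial G}{\partial x_2}} \right) dx_1\, dx_2.$$

We then note that $\mathcal{D}(u) = \langle u, u \rangle$. If $v$ is any function in $C^2(\overline{\Omega})$ with
$v|_{\partial\Omega} = 0$, then for all $\epsilon$ we have

$$\mathcal{D}(u + \epsilon v) \geq \mathcal{D}(u),$$

since $u + \epsilon v$ and $u$ have the same boundary values, and $u$ minimizes the
Dirichlet integral. We note, however, that

$$\mathcal{D}(u + \epsilon v) = \mathcal{D}(u) + \epsilon^2 \mathcal{D}(v) + \epsilon \langle u, v \rangle + \epsilon \langle v, u \rangle.$$

Hence

$$\epsilon^2 \mathcal{D}(v) + \epsilon \langle u, v \rangle + \epsilon \langle v, u \rangle \geq 0,$$

and since $\epsilon$ can be either positive or negative, this can happen only if
$\operatorname{Re} \langle u, v \rangle = 0$. Similarly, considering the perturbation $u + i\epsilon v$, we find
$\operatorname{Im} \langle u, v \rangle = 0$, and therefore $\langle u, v \rangle = 0$. An integration by parts then
provides

$$0 = \langle u, v \rangle = -\int_\Omega (\Delta u)\, \overline{v}$$

for all $v \in C^2(\overline{\Omega})$ with $v|_{\partial\Omega} = 0$. This implies that $\Delta u = 0$ in $\Omega$, and of
course $u$ equals $f$ on the boundary.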
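The mechanism of this proof can be observed in a discrete setting. The sketch below is an analogue under stated assumptions, not the argument of the text: the domain is replaced by a square grid, the Dirichlet integral by a sum of squared differences over grid edges (the hypothetical helper `dirichlet_energy`), and $u$ by samples of $x^2 - y^2$, whose five-point discrete Laplacian vanishes. The perturbation `v` is an arbitrary grid function that is zero on the boundary.

```python
def dirichlet_energy(u):
    """Sum of squared differences over all horizontal and vertical grid edges."""
    n = len(u)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += (u[i + 1][j] - u[i][j]) ** 2
            if j + 1 < n:
                total += (u[i][j + 1] - u[i][j]) ** 2
    return total

n = 9
# u(x, y) = x^2 - y^2 at integer grid points: discretely harmonic at interior nodes.
u = [[float(i * i - j * j) for j in range(n)] for i in range(n)]
# v vanishes on the boundary of the grid and is otherwise arbitrary.
v = [[float(i * (n - 1 - i) * j * (n - 1 - j) * (1 + i - 2 * j)) for j in range(n)]
     for i in range(n)]

def perturbed_energy(eps):
    """Energy of u + eps * v, which has the same boundary values as u."""
    w = [[u[i][j] + eps * v[i][j] for j in range(n)] for i in range(n)]
    return dirichlet_energy(w)

base = dirichlet_energy(u)
gaps = [perturbed_energy(eps) - base for eps in (-0.5, -0.01, 0.01, 0.5)]
```

In this model the cross term vanishes by summation by parts, so each entry of `gaps` equals $\epsilon^2$ times the energy of `v` and is positive for both signs of $\epsilon$.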


Nevertheless, several serious objections were later raised to Dirichlet’s
principle. The first was by Weierstrass, who pointed out that it
was not clear (and had not been proved) that a minimizing function for
the Dirichlet integral exists, so there might simply be no winner to the
implied competition in Proposition 4.1. He argued by analogy with a
simpler one-dimensional problem: that of minimizing the integral

$$D(\varphi) = \int_{-1}^{1} |x \varphi'(x)|^2 \, dx$$

among all $C^1$ functions on $[-1, 1]$ that satisfy $\varphi(-1) = -1$ and $\varphi(1) = 1$.
The infimum of the values taken by this integral is zero. To verify this, let
$\psi$ be a smooth non-decreasing function on $\mathbb{R}$ that satisfies $\psi(x) = 1$ for
$x \geq 1$, and $\psi(x) = -1$ if $x \leq -1$. For each $0 < \epsilon < 1$, we consider the
function

$$\varphi_\epsilon(x) = \begin{cases} 1 & \text{if } \epsilon \leq x, \\ \psi(x/\epsilon) & \text{if } -\epsilon < x < \epsilon, \\ -1 & \text{if } x \leq -\epsilon. \end{cases}$$

Then $\varphi_\epsilon$ satisfies the desired constraints, and if $M$ denotes a bound for
the derivative of $\psi$ we find

$$D(\varphi_\epsilon) = \int_{-\epsilon}^{\epsilon} |x|^2 |\epsilon^{-1} \psi'(x/\epsilon)|^2 \, dx \leq \int_{-\epsilon}^{\epsilon} |\psi'(x/\epsilon)|^2 \, dx \leq 2\epsilon M^2.$$

In the limit as $\epsilon$ tends to 0, we find that the infimum of the integral
$D(\varphi)$ is zero. This value cannot be attained by a $C^1$ function
satisfying the boundary conditions, since $D(\varphi) = 0$ implies $\varphi'(x) = 0$ and
thus $\varphi$ is constant.
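The decay of $D(\varphi_\epsilon)$ can be checked numerically. The following sketch makes one concrete choice that is ours and not the text's: $\psi(t) = \sin(\pi t/2)$ for $|t| \leq 1$, extended by $\pm 1$ outside. This $\psi$ is only $C^1$ rather than smooth, which is enough for a competition among $C^1$ functions, and its derivative is bounded by $M = \pi/2$. The integral is approximated by a midpoint rule.

```python
import math

def psi_prime(t):
    """Derivative of psi(t) = sin(pi t / 2) on (-1, 1); psi is constant outside."""
    return (math.pi / 2.0) * math.cos(math.pi * t / 2.0) if abs(t) < 1.0 else 0.0

def weierstrass_energy(eps, n=20000):
    """Midpoint-rule value of the integral over (-eps, eps) of |x * psi'(x / eps) / eps|^2."""
    h = 2.0 * eps / n
    total = 0.0
    for k in range(n):
        x = -eps + (k + 0.5) * h
        total += (x * psi_prime(x / eps) / eps) ** 2
    return total * h

M = math.pi / 2.0  # bound for |psi'|
energies = {eps: weierstrass_energy(eps) for eps in (0.5, 0.1, 0.01, 0.001)}
```

The computed values shrink in proportion to $\epsilon$ and stay below the bound $2\epsilon M^2$, while no admissible function can make the integral equal to zero.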
A further objection was raised by Hadamard, who remarked that $\mathcal{D}(u)$
may be infinite even for a solution $u$ of the boundary value problem:
thus, in effect, there may simply be no competitors who qualify for the
competition!

To illustrate this point, we return to the disc, and consider the function

$$f(\theta) = f_\alpha(\theta) = \sum_{n=0}^{\infty} 2^{-n\alpha} e^{i 2^n \theta}$$

for $\alpha > 0$. This function first appeared in Chapter 4 of Book I, where it
is shown that $f_\alpha$ is continuous but nowhere differentiable if $\alpha < 1$. The
solution of the Dirichlet problem on the unit disc with boundary value
$f_\alpha$ is given by the Poisson integral

$$u(r, \theta) = \sum_{n=0}^{\infty} r^{2^n} 2^{-n\alpha} e^{i 2^n \theta}.$$


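However, the use of polar coordinates gives

$$\left| \frac{\partial u}{\partial x_1} \right|^2 + \left| \frac{\partial u}{\partial x_2} \right|^2 = \left| \frac{\partial u}{\partial r} \right|^2 + \frac{1}{r^2} \left| \frac{\partial u}{\partial \theta} \right|^2.$$

Thus

$$\int_{D_\rho} \left( \left| \frac{\partial u}{\partial x_1} \right|^2 + \left| \frac{\partial u}{\partial x_2} \right|^2 \right) dx_1\, dx_2 = \int_0^\rho \int_0^{2\pi} \left( \left| \frac{\partial u}{\partial r} \right|^2 + \frac{1}{r^2} \left| \frac{\partial u}{\partial \theta} \right|^2 \right) d\theta\, r\, dr,$$

where $D_\rho$ is the disc of radius $0 < \rho < 1$ centered at the origin. Since

$$\frac{\partial u}{\partial r} = \sum_{n=0}^{\infty} 2^n r^{2^n - 1} 2^{-n\alpha} e^{i 2^n \theta} \quad \text{and} \quad \frac{\partial u}{\partial \theta} = \sum_{n=0}^{\infty} r^{2^n} 2^{-n\alpha}\, i 2^n e^{i 2^n \theta},$$

applications of Parseval’s identity lead to

$$\int_{D_\rho} \left( \left| \frac{\partial u}{\partial x_1} \right|^2 + \left| \frac{\partial u}{\partial x_2} \right|^2 \right) dx_1\, dx_2 \approx \int_0^\rho \sum_{n=0}^{\infty} 2^{2n+1} 2^{-2n\alpha} r^{2^{n+1} - 1}\, dr = \sum_{n=0}^{\infty} \rho^{2^{n+1}} 2^n 2^{-2n\alpha},$$

where $\approx$ indicates equality up to the constant factor $2\pi$. This quantity
tends to infinity as $\rho \to 1$ if $\alpha \leq 1/2$.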

One can formulate this objection in a more precise way by appealing 
to the result in Exercise 20. 
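The divergence can be seen numerically from the final series $\sum_n \rho^{2^{n+1}} 2^n 2^{-2n\alpha}$. The sketch below is illustrative only: it truncates the series at a fixed number of terms (an assumption that is harmless here, since the discarded terms are negligible for the radii used) and compares one exponent below the threshold $1/2$ with one above it. The function name `truncated_dirichlet_sum` is ours.

```python
def truncated_dirichlet_sum(rho, alpha, terms=60):
    """Partial sum of sum_n rho^(2^(n+1)) * 2^n * 2^(-2 n alpha)."""
    return sum(rho ** (2 ** (n + 1)) * 2.0 ** (n * (1.0 - 2.0 * alpha))
               for n in range(terms))

# Radii approaching 1: 0.9, 0.99, ..., 0.999999.
radii = [1.0 - 10.0 ** (-k) for k in range(1, 7)]
rough = [truncated_dirichlet_sum(rho, alpha=0.25) for rho in radii]   # alpha <= 1/2
smooth = [truncated_dirichlet_sum(rho, alpha=0.75) for rho in radii]  # alpha > 1/2
```

For $\alpha = 1/4$ the values in `rough` grow without any visible bound as $\rho \to 1$, while for $\alpha = 3/4$ the values in `smooth` stay below the geometric sum $\sum_n 2^{-n/2}$.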
Despite these significant difficulties, Dirichlet’s principle can indeed be
validated, if applied in the appropriate way. A key insight is that the
space of competing functions arising in the proof of the above proposition
is itself a pre-Hilbert space, with inner product $\langle \cdot, \cdot \rangle$ given there. The
desired solution lies in the completion of this pre-Hilbert space, and this
requires the $L^2$ theory for its analysis. These ideas were clearly not
available at the time Dirichlet’s principle was first formulated and used.

In what follows we shall describe how these additional concepts can
be exploited. We will begin our presentation in the more general
$d$-dimensional setting, but conclude with the application of these
techniques to the solution of the two-dimensional problem (19). As an
important preliminary matter we start with the study of some basic properties
of harmonic functions.


4.1 Harmonic functions 

Throughout this section $\Omega$ will denote an open subset of $\mathbb{R}^d$. A function $u$
is harmonic in $\Omega$ if it is twice continuously differentiable¹² and $u$ solves

$$\Delta u = \sum_{j=1}^{d} \frac{\partial^2 u}{\partial x_j^2} = 0.$$

We shall see that harmonic functions can be characterized by a number
of equivalent properties.¹³ Adapting the terminology used in Section 3,
we say that $u$ is weakly harmonic in $\Omega$ if

$$(u, \Delta\psi) = 0 \quad \text{for every } \psi \in C_0^\infty(\Omega). \tag{20}$$

Note that the left-hand side of (20) is well-defined for any $u$ that is
integrable on compact subsets of $\Omega$. Thus, in particular, a weakly harmonic
function needs to be defined only almost everywhere. Clearly, however,
any harmonic function is weakly harmonic.

Another notion is the mean-value property generalizing the identity (9)
in Section 2 for holomorphic functions. A continuous function $u$
defined in $\Omega$ satisfies this property if

$$u(x_0) = \frac{1}{m(B)} \int_B u(x)\, dx \tag{21}$$

for each ball $B$ whose center is $x_0$ and whose closure $\overline{B}$ is contained in $\Omega$.
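The mean-value property (21) is easy to test numerically in the plane. The sketch below is an illustration under our own assumptions: the average over a disc is approximated by a midpoint rule in polar coordinates (the hypothetical helper `disc_average`), and the two sample functions, one harmonic and one not, are chosen only for the demonstration.

```python
import math

def disc_average(u, x0, y0, radius, n_r=200, n_theta=200):
    """Midpoint-rule average of u over the disc of the given radius centred at (x0, y0)."""
    total = 0.0
    dr = radius / n_r
    dtheta = 2.0 * math.pi / n_theta
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_theta):
            t = (j + 0.5) * dtheta
            total += u(x0 + r * math.cos(t), y0 + r * math.sin(t)) * r
    return total * dr * dtheta / (math.pi * radius ** 2)

def harmonic(x, y):
    return x * x - y * y + 3.0 * x * y + 2.0 * x  # Laplacian is 0

def not_harmonic(x, y):
    return x * x + y * y  # Laplacian is 4

centre = (0.3, -0.7)
avg_h = disc_average(harmonic, *centre, radius=0.5)
avg_n = disc_average(not_harmonic, *centre, radius=0.5)
```

Here `avg_h` reproduces the value of `harmonic` at the centre, whereas `avg_n` exceeds the central value of `not_harmonic` by about $r^2/2 = 0.125$, so the property fails for the second function.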


12 In other words, $u$ is in $C^2(\Omega)$ in the notation of Section 3.1.

13 Note that in the case of one dimension, harmonic functions are linear and so their
theory is essentially trivial.

The following two theorems give alternative characterizations of harmonic
functions. Their proofs are closely intertwined.

Theorem 4.2 If $u$ is harmonic in $\Omega$, then $u$ satisfies the mean-value
property (21). Conversely, a continuous function satisfying the mean-value
property is harmonic.

Theorem 4.3 Any weakly harmonic function $u$ in $\Omega$ can be corrected
on a set of measure zero so that the resulting function is harmonic in $\Omega$.

The above statement says that for a given weakly harmonic function $u$
there exists a harmonic function $\tilde{u}$, so that $\tilde{u}(x) = u(x)$ for a.e. $x \in \Omega$.
Notice that since $\tilde{u}$ is necessarily continuous it is uniquely determined by $u$.

Before we prove the theorems, we deduce a noteworthy corollary. It is
a version of the maximum principle.

Corollary 4.4 Suppose $\Omega$ is a bounded open set, and let $\partial\Omega = \overline{\Omega} - \Omega$
denote its boundary. Assume that $u$ is continuous in $\overline{\Omega}$ and is harmonic
in $\Omega$. Then

$$\max_{x \in \overline{\Omega}} |u(x)| = \max_{x \in \partial\Omega} |u(x)|.$$


Proof. Since the sets $\overline{\Omega}$ and $\partial\Omega$ are compact and $u$ is continuous, the
two maxima above are clearly attained. We suppose that $\max_{x \in \overline{\Omega}} |u(x)|$
is attained at an interior point $x_0 \in \Omega$, for otherwise there is nothing to
prove.

Now by the mean-value property, $|u(x_0)| \leq \frac{1}{m(B)} \int_B |u(x)|\, dx$. If for
some point $x' \in B$ we had $|u(x')| < |u(x_0)|$, then a similar inequality
would hold in a small neighborhood of $x'$, and since $|u(x)| \leq |u(x_0)|$
throughout $B$, the result would be that $\frac{1}{m(B)} \int_B |u(x)|\, dx < |u(x_0)|$, which
is a contradiction. Hence $|u(x)| = |u(x_0)|$ for each $x \in B$. Now this is
true for each ball $B_r$ of radius $r$, centered at $x_0$, such that $B_r \subset \Omega$. Let
$r_0$ be the least upper bound of such $r$; then $\overline{B_{r_0}}$ intersects the boundary
$\partial\Omega$ at some point $\tilde{x}$. Since $|u(x)| = |u(x_0)|$ for all $x \in B_r$, $r < r_0$, it follows
by continuity that $|u(\tilde{x})| = |u(x_0)|$, proving the corollary.
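A discrete analogue of the maximum principle can be observed directly. The sketch below rests on assumptions of ours and is not the method of the text: the domain is a square grid, harmonicity is replaced by the five-point averaging condition, and the discrete boundary value problem is solved by plain Jacobi iteration (the hypothetical helper `solve_discrete_dirichlet`). The boundary data are samples of $x^2 - y^2$.

```python
def solve_discrete_dirichlet(boundary_value, n=12, sweeps=2000):
    """Jacobi iteration for the 5-point discrete Laplace equation on an n x n grid.

    boundary_value(i, j) prescribes u on boundary nodes; interior nodes start at 0.
    """
    u = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i in (0, n - 1) or j in (0, n - 1):
                u[i][j] = boundary_value(i, j)
    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Each interior value becomes the average of its four neighbours.
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j] + u[i][j - 1] + u[i][j + 1])
        u = new
    return u

n = 12
grid = solve_discrete_dirichlet(lambda i, j: float(i * i - j * j), n=n)
boundary_max = max(abs(grid[i][j]) for i in range(n) for j in range(n)
                   if i in (0, n - 1) or j in (0, n - 1))
interior_max = max(abs(grid[i][j]) for i in range(1, n - 1) for j in range(1, n - 1))
```

Because every interior value is an average of its neighbours, `interior_max` cannot exceed `boundary_max`; the iteration also recovers the samples of $x^2 - y^2$ in the interior, in line with uniqueness.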

Turning to the proofs of the theorems, we first establish a variant
of Green’s formula (for the unit ball) that does not explicitly involve
boundary terms.¹⁴ Here $u$, $v$, and $\eta$ are assumed to be twice continuously
differentiable functions in a neighborhood of the closure of $B$, but $\eta$ is
also supposed to be supported in a compact subset of $B$.


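14 The more usual version requires integration over the (boundary) sphere, a topic
deferred to the next chapter. See also Exercises 6 and 7 in that chapter.

Lemma 4.5 We have the identity

$$\int_B (v \Delta u - u \Delta v)\, \eta \, dx = \int_B u (\nabla v \cdot \nabla \eta) - v (\nabla u \cdot \nabla \eta)\, dx.$$

Here $\nabla u$ is the gradient of $u$, that is, $\nabla u = \left( \frac{\partial u}{\partial x_1}, \frac{\partial u}{\partial x_2}, \ldots, \frac{\partial u}{\partial x_d} \right)$, and

$$\nabla v \cdot \nabla \eta = \sum_{j=1}^{d} \frac{\partial v}{\partial x_j} \frac{\partial \eta}{\partial x_j},$$

with $\nabla u \cdot \nabla \eta$ defined similarly.

In fact, by integrating by parts as in the proof of (14) we have

$$\int_B \frac{\partial u}{\partial x_j} v \eta \, dx = -\int_B u \frac{\partial v}{\partial x_j} \eta \, dx - \int_B u v \frac{\partial \eta}{\partial x_j}\, dx.$$

We then repeat this with $u$ replaced by $\partial u / \partial x_j$, and sum in $j$ to obtain

$$\int_B (\Delta u) v \eta \, dx = -\int_B (\nabla u \cdot \nabla v)\, \eta \, dx - \int_B (\nabla u \cdot \nabla \eta)\, v \, dx.$$

This yields the lemma if we subtract from this the symmetric formula
with $u$ and $v$ interchanged.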

We shall apply the lemma when $u$ is a given harmonic function, while
$v$ is one of the three following “test” functions: first, $v(x) = 1$; second,
$v(x) = |x|^2$; and third, $v(x) = |x|^{-d+2}$ if $d \geq 3$, while $v(x) = \log |x|$ if
$d = 2$. The relevance of these choices arises because $\Delta v = 0$ in the first
case, while $\Delta v$ is a non-zero constant in the second case; also $v$ in the
third case is a constant multiple of a “fundamental solution,” and in
particular $v(x)$ is harmonic for $x \neq 0$.

When $v(x) = 1$, we take $\eta = \eta_\epsilon^+$, where $\eta_\epsilon^+(x) = 1$ for $|x| \leq 1 - \epsilon$,
$\eta_\epsilon^+(x) = 0$ for $|x| \geq 1$, and $|\nabla \eta_\epsilon^+(x)| \leq c/\epsilon$. We accomplish this by setting
$\eta_\epsilon^+(x) = \chi\left( \frac{|x| - 1 + \epsilon}{\epsilon} \right)$ for $1 - \epsilon \leq |x| \leq 1$, where $\chi$ is a fixed $C^2$ function
on $[0, 1]$ that equals 1 in $[0, 1/4]$ and equals 0 in $[3/4, 1]$. A picture of $\eta_\epsilon^+$
is given in Figure 3.

Since $u$ is harmonic, we see that with $v = 1$, Lemma 4.5 implies

$$\int_B \nabla u \cdot \nabla \eta_\epsilon^+ \, dx = 0. \tag{22}$$

Next we take $v(x) = |x|^2$; then clearly $\Delta v = 2d$, and with $\eta = \eta_\epsilon^+$ the
lemma yields:

$$2d \int_B u\, \eta_\epsilon^+ \, dx = \int_B |x|^2 (\nabla u \cdot \nabla \eta_\epsilon^+)\, dx - 2 \int_B u\, (x \cdot \nabla \eta_\epsilon^+)\, dx.$$

Figure 3. The function $\eta_\epsilon^+$

However, since $\nabla \eta_\epsilon^+$ is supported in the spherical shell $S_\epsilon^+ = \{x : 1 - \epsilon \leq
|x| \leq 1\}$, we see that

$$\int_B |x|^2 (\nabla u \cdot \nabla \eta_\epsilon^+)\, dx = \int_B (\nabla u \cdot \nabla \eta_\epsilon^+)\, dx + O(\epsilon),$$

and hence by (22) we get

$$d \int_B u \, dx = -\lim_{\epsilon \to 0} \int_B u\, (x \cdot \nabla \eta_\epsilon^+)\, dx. \tag{23}$$


We finally turn to $v(x) = |x|^{-d+2}$, when $d \geq 3$, and calculate $(\Delta v)(x)$
for $x \neq 0$ to see that it vanishes there. In fact, since $\partial |x| / \partial x_j = x_j / |x|$,
we note that

$$\frac{\partial |x|^a}{\partial x_j} = a x_j |x|^{a-2}$$

and

$$\frac{\partial^2 |x|^a}{\partial x_j^2} = a |x|^{a-2} + a(a - 2) x_j^2 |x|^{a-4}.$$

Upon adding in $j$ we obtain that $\Delta(|x|^a) = [da + a(a - 2)] |x|^{a-2}$, and
this is zero if $a = -d + 2$ (or $a = 0$). A similar argument shows that
$\Delta(\log |x|) = 0$ when $d = 2$ and $x \neq 0$.
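The closed form for the Laplacian of $|x|^a$ can be compared with a finite-difference approximation. The sketch below is a numerical illustration only; the second-order central difference, the step size, the sample point in $\mathbb{R}^3$, and the helper names `norm_power`, `laplacian_fd`, and `predicted` are all our choices.

```python
import math

def norm_power(point, a):
    """|x|^a for a point x in R^d away from the origin."""
    return math.sqrt(sum(c * c for c in point)) ** a

def laplacian_fd(func, point, h=1e-3):
    """Second-order central-difference approximation of the Laplacian at a point."""
    centre = func(point)
    total = 0.0
    for j in range(len(point)):
        plus = list(point)
        plus[j] += h
        minus = list(point)
        minus[j] -= h
        total += (func(plus) + func(minus) - 2.0 * centre) / (h * h)
    return total

def predicted(point, a):
    """[d a + a (a - 2)] |x|^(a - 2), the closed form stated in the text."""
    d = len(point)
    return (d * a + a * (a - 2.0)) * norm_power(point, a - 2.0)

x3 = [0.6, -0.8, 1.1]
results = {a: (laplacian_fd(lambda p: norm_power(p, a), x3), predicted(x3, a))
           for a in (-1.0, 2.0, 3.5)}
```

With $d = 3$ the exponent $a = -1 = -d + 2$ gives a numerically vanishing Laplacian, and $a = 2$ gives the constant $2d = 6$ used for the second test function.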

We now apply the lemma with this $v$ and $\eta = \eta_\epsilon$ defined as follows:

$$\eta_\epsilon(x) = \begin{cases} 1 - \chi(|x|/\epsilon) & \text{for } |x| \leq \epsilon, \\ 1 & \text{for } \epsilon \leq |x| \leq 1 - \epsilon, \\ \eta_\epsilon^+(x) = \chi\left( \frac{|x| - 1 + \epsilon}{\epsilon} \right) & \text{for } 1 - \epsilon \leq |x| \leq 1. \end{cases}$$

The picture for $\eta_\epsilon$ is as follows (Figure 4):

Figure 4. The function $\eta_\epsilon$

We note that $|\nabla \eta_\epsilon|$ is $O(1/\epsilon)$ throughout. Now both $u$ and $v$ are
harmonic in the support of $\eta_\epsilon$, and in this case $\nabla \eta_\epsilon$ is supported only
near the unit sphere (in the shell $S_\epsilon^+$) or near the origin (in the ball
$B_\epsilon = \{|x| \leq \epsilon\}$). Thus the right-hand side of the identity of the lemma
gives two contributions, one over $S_\epsilon^+$ and the other over $B_\epsilon$. We consider
the first contribution (when $d \geq 3$); it is

$$\int_{S_\epsilon^+} u \nabla(|x|^{-d+2}) \cdot \nabla \eta_\epsilon \, dx - \int_{S_\epsilon^+} |x|^{-d+2} (\nabla u \cdot \nabla \eta_\epsilon^+)\, dx.$$

Now the first integral is $(-d + 2) \int_{S_\epsilon^+} u |x|^{-d} (x \cdot \nabla \eta_\epsilon^+)\, dx$, which by (23)
tends to $c \int_B u \, dx$ as $\epsilon \to 0$, where $c$ is the constant $(d - 2)d$, since
$|x|^{-d} - 1 = O(\epsilon)$ over $S_\epsilon^+$. The second term tends to zero as $\epsilon \to 0$ because of (22)
and the fact that the integrand there is supported in the shell $S_\epsilon^+$. A
similar argument for $d = 2$, with $v(x) = \log |x|$, yields the result with
$c = -2$. In either case only the fact that $c \neq 0$ is used below.

To consider the contribution near the origin, that is, over $B_\epsilon$, we
temporarily make the additional assumption that $u(0) = 0$. Then because
of the differentiability assumption satisfied by a harmonic function, we
have $u(x) = O(|x|)$ as $|x| \to 0$. Now over $B_\epsilon$ we have two terms, the first
being $\int_{B_\epsilon} u \nabla(|x|^{-d+2}) \cdot \nabla \eta_\epsilon \, dx$, which is majorized by

$$\int_{B_\epsilon} O(\epsilon) |x|^{-d+1} O(1/\epsilon)\, dx \leq O\left( \int_{|x| \leq \epsilon} |x|^{-d+1}\, dx \right) \leq O(\epsilon),$$

because of (8) in Section 2 of Chapter 2. This term tends to 0 with $\epsilon$.
The second term is $\int_{B_\epsilon} |x|^{-d+2} (\nabla u \cdot \nabla \eta_\epsilon)\, dx$, which is majorized by

$$O(1/\epsilon) \int_{|x| \leq \epsilon} |x|^{-d+2}\, dx = O(\epsilon),$$

using the result just cited. We have used the fact that $\nabla u$ is bounded
and $\nabla \eta_\epsilon$ is $O(1/\epsilon)$ throughout $B$. Letting $\epsilon \to 0$ we see that this term
tends to zero also. A similar argument works when $d = 2$.

Thus we have proved that if $u$ is harmonic in a neighborhood of the
closure of the unit ball $B$, and $u(0) = 0$, then $\int_B u \, dx = 0$. We can drop
the assumption $u(0) = 0$ by applying the conclusion we have just reached
to $u(x) - u(0)$ in place of $u(x)$. Therefore we have achieved the
mean-value property (21) for the unit ball.

Now suppose $B_r(x_0) = \{x : |x - x_0| < r\}$ is the ball of radius $r$
centered at $x_0$, and consider $U(x) = u(x_0 + rx)$. If we suppose that $u$ is
harmonic in $\overline{B_r(x_0)}$, then clearly $U$ is harmonic in the unit ball (indeed, the
property of being harmonic is unchanged under translations $x \to x + x_0$
and dilations $x \to rx$, as is easily verified). Thus if $u$ is harmonic in
$\Omega$ and $\overline{B_r(x_0)} \subset \Omega$, then by the result just proved $U(0) = \frac{1}{m(B)} \int_B U(x)\, dx$,
which means that

$$u(x_0) = \frac{1}{r^d m(B)} \int_{|x| < r} u(x_0 + x)\, dx = \frac{1}{m(B_r(x_0))} \int_{B_r(x_0)} u(x)\, dx,$$

by the relative invariance of Lebesgue measure under dilations and
translations. This establishes (21) in general.



The converse property

To prove this, we first show that the mean-value property allows a useful
extension of itself. For this purpose, we fix a function $\varphi(y)$ that is
continuous in the closed unit ball $\{|y| \leq 1\}$ and is radial (that is, $\varphi(y) = \Phi(|y|)$
for an appropriate $\Phi$), and extend $\varphi$ to be zero when $|y| > 1$. Suppose
in addition that $\int \varphi(y)\, dy = 1$. We then claim the following:

Lemma 4.6 Whenever $u$ satisfies the mean-value property (21) in $\Omega$,
and the closure of the ball $\{x : |x - x_0| < r\}$ lies in $\Omega$, then

$$u(x_0) = \int_{\mathbb{R}^d} u(x_0 - ry) \varphi(y)\, dy = \int_{\mathbb{R}^d} u(x_0 - y) \varphi_r(y)\, dy = (u * \varphi_r)(x_0), \tag{24}$$

where $\varphi_r(y) = r^{-d} \varphi(y/r)$.

That the second of the two identities holds is an immediate consequence
of the change of variables $y \to y/r$; the rightmost equality is merely the
definition of the convolution $u * \varphi_r$.

We can prove (24) as a consequence of a simple observation about
integration. Let $\psi(y)$ be another function on the ball $\{|y| \leq 1\}$, which
we assume is bounded. For each $N$, a large positive integer, denote by
$B(j)$ the ball $\{|y| \leq j/N\}$. Recall that $\varphi(y) = \Phi(|y|)$. Then

$$\int \varphi(y) \psi(y)\, dy = \lim_{N \to \infty} \sum_{j=1}^{N} \Phi\left( \frac{j}{N} \right) \int_{B(j) - B(j-1)} \psi(y)\, dy. \tag{25}$$

To verify this, note that the left-hand side of (25) equals

$$\sum_{j=1}^{N} \int_{B(j) - B(j-1)} \varphi(y) \psi(y)\, dy.$$

However, $\sup_{1 \leq j \leq N} \sup_{y \in B(j) - B(j-1)} |\varphi(y) - \Phi(j/N)| = \epsilon_N$, which tends
to zero as $N \to \infty$, since $\varphi$ is radial, continuous, and $\varphi(y) = \Phi(|y|)$. Thus
the left-hand side of (25) differs from $\sum_{j=1}^{N} \Phi(j/N) \int_{B(j) - B(j-1)} \psi(y)\, dy$
by at most $\epsilon_N \int_{|y| \leq 1} |\psi(y)|\, dy$, proving (25).

We now use this in the case where $\psi(y) = u(x_0 - ry)$ and $\varphi$ is as before.
Then

$$\int u(x_0 - ry) \varphi(y)\, dy = \lim_{N \to \infty} \sum_{j=1}^{N} \Phi\left( \frac{j}{N} \right) \int_{B(j) - B(j-1)} u(x_0 - ry)\, dy.$$

However, it follows from the mean-value property assumed for $u$ that

$$\int_{B(j) - B(j-1)} u(x_0 - ry)\, dy = u(x_0) [m(B(j)) - m(B(j - 1))].$$

Therefore, the right-hand side above equals

$$u(x_0) \lim_{N \to \infty} \sum_{j=1}^{N} \Phi\left( \frac{j}{N} \right) \int_{B(j) - B(j-1)} dy,$$

and this is $u(x_0)$ if we use (25) again, this time with $\psi = 1$, and recall
that $\int \varphi(y)\, dy = 1$. We have therefore proved the lemma.


We see from this that every continuous function which satisfies the
mean-value property is its own regularization! To be precise, we have

$$u(x) = (u * \varphi_r)(x) \tag{26}$$

whenever $x \in \Omega$ and the distance from $x$ to the boundary of $\Omega$ is larger
than $r$. If we now require in addition that $\varphi \in C_0^\infty(\{|y| < 1\})$, then by the
discussion in Section 1 we conclude that $u$ is smooth throughout $\Omega$.

Let us now establish that such functions are harmonic. Indeed, by
Taylor’s theorem, for every $x_0 \in \Omega$

$$u(x_0 + x) - u(x_0) = \sum_{j=1}^{d} a_j x_j + \frac{1}{2} \sum_{j,k=1}^{d} a_{jk} x_j x_k + \epsilon(x), \tag{27}$$

where $a_j = \frac{\partial u}{\partial x_j}(x_0)$, $a_{jk} = \frac{\partial^2 u}{\partial x_j \partial x_k}(x_0)$, and $\epsilon(x) = O(|x|^3)$ as $|x| \to 0$.
We note next that $\int_{|x| \leq r} x_j \, dx = 0$ and
$\int_{|x| \leq r} x_j x_k \, dx = 0$ for all $j$ and $k$ with $k \neq j$. This follows by carrying
out the integrations first in the $x_j$ variable and noting that the integral
vanishes because $x_j$ is an odd function. Also by an obvious symmetry
$\int_{|x| \leq r} x_j^2 \, dx = \int_{|x| \leq r} x_k^2 \, dx$, and by the relative dilation-invariance (see
Section 3, Chapter 1) these are equal to $r^2 \int_{|x| \leq r} (x_1/r)^2 \, dx =
r^{d+2} \int_{|x| \leq 1} x_1^2 \, dx = c r^{d+2}$, with $c > 0$. We now integrate both sides of (27)
over the ball $\{|x| \leq r\}$, divide by $r^d$, and use the mean-value property.
The result is that

$$\frac{c}{2} r^2 \sum_{j=1}^{d} a_{jj} = \frac{c}{2} r^2 (\Delta u)(x_0) = O(r^3).$$

Letting $r \to 0$ then gives $\Delta u(x_0) = 0$. Since $x_0$ was an arbitrary point
of $\Omega$, the proof of Theorem 4.2 is concluded.


Theorem 4.3 and some corollaries 

We come now to the proof of Theorem 4.3. Let us assume that $u$ is
weakly harmonic in $\Omega$. For each $\epsilon > 0$ we define $\Omega_\epsilon$ to be the set of
points in $\Omega$ that are at a distance greater than $\epsilon$ from its boundary:

$$\Omega_\epsilon = \{x \in \Omega : d(x, \partial\Omega) > \epsilon\}.$$

Notice that $\Omega_\epsilon$ is open, and that every point of $\Omega$ belongs to $\Omega_\epsilon$ if $\epsilon$
is small enough. Then the regularization $u * \varphi_r = u_r$ considered in the
previous theorem is defined in $\Omega_\epsilon$, for $r < \epsilon$, and as we have noted is a
smooth function there. We next observe that it is weakly harmonic in
$\Omega_\epsilon$. In fact, for $\psi \in C_0^\infty(\Omega_\epsilon)$ we have

$$(u_r, \Delta\psi) = \int_{\mathbb{R}^d} \left( \int_{\mathbb{R}^d} u(x - ry) \varphi(y)\, dy \right) \overline{(\Delta\psi)(x)}\, dx = \int_{\mathbb{R}^d} \varphi(y) \left( \int_{\mathbb{R}^d} u(x - ry) \overline{(\Delta\psi)(x)}\, dx \right) dy,$$

by Fubini’s theorem, and the inner integral vanishes for $|y| \leq 1$,
because it equals $(u, \Delta\psi_r)$, with $\psi_r(x) = \psi(x + ry)$. Thus we have

$$(u * \varphi_r, \Delta\psi) = 0,$$

and hence $u * \varphi_r$ is weakly harmonic. Next, since this regularization is
automatically smooth it is then also harmonic. Moreover, we claim that

$$(u * \varphi_{r_1})(x) = (u * \varphi_{r_2})(x) \tag{28}$$

whenever $x \in \Omega_\epsilon$ and $r_1 + r_2 < \epsilon$. Indeed, $(u * \varphi_{r_1}) * \varphi_{r_2} = u * \varphi_{r_1}$ as
we have shown in (26) above. However convolutions are commutative
(see Remark (6) in Chapter 2); thus $(u * \varphi_{r_1}) * \varphi_{r_2} = (u * \varphi_{r_2}) * \varphi_{r_1} =
u * \varphi_{r_2}$, and (28) is proved.

Now we can let $r_1$ tend to zero, while keeping $r_2$ fixed. We know by the
properties of approximations to the identity that $u * \varphi_{r_1}(x) \to u(x)$ for
almost every $x$ in $\Omega_\epsilon$; hence $u(x)$ equals $u_{r_2}(x)$ for almost every $x \in \Omega_\epsilon$.
Thus $u$ can be corrected on $\Omega_\epsilon$ (setting it equal to $u_{r_2}$), so that it becomes
harmonic there. Now since $\epsilon$ can be taken arbitrarily small, the proof of
the theorem is complete.

We state several further corollaries arising out of the above theorems. 

Corollary 4.7 Every harmonic function is indefinitely differentiable. 

Corollary 4.8 Suppose $\{u_n\}$ is a sequence of harmonic functions in $\Omega$
that converges to a function $u$ uniformly on compact subsets of $\Omega$ as
$n \to \infty$. Then $u$ is also harmonic.

The first of these corollaries was already proved as a consequence
of (26). For the second, we use the fact that each $u_n$ satisfies the
mean-value property

$$u_n(x_0) = \frac{1}{m(B)} \int_B u_n(x)\, dx,$$

whenever $B$ is a ball with center at $x_0$, and $\overline{B} \subset \Omega$. Thus by the uniform
convergence it follows that $u$ also satisfies this property, and hence $u$ is
harmonic.

We should point out that these properties of harmonic functions on
$\mathbb{R}^d$ are reminiscent of similar properties of holomorphic functions. But
this should not be surprising, given the close connection between these
two classes of functions in the special case $d = 2$.

4.2 The boundary value problem and Dirichlet’s principle 

The $d$-dimensional Dirichlet boundary value problem we are concerned
with may be stated as follows. Let $\Omega$ be an open bounded set in $\mathbb{R}^d$.
Given a continuous function $f$ defined on the boundary $\partial\Omega$, we wish to
find a function $u$ that is continuous in $\overline{\Omega}$, harmonic in $\Omega$, and such that
$u = f$ on $\partial\Omega$.

An important preliminary observation is that the solution to the
problem, if it exists, is unique. Indeed, if $u_1$ and $u_2$ are two solutions
then $u_1 - u_2$ is harmonic in $\Omega$ and vanishes on the boundary. Thus by
the maximum principle (Corollary 4.4) we have $u_1 - u_2 = 0$, and hence
$u_1 = u_2$.

Turning to the existence of a solution, we shall now pursue the
approach of Dirichlet’s principle outlined earlier.

We consider the class of functions $C^1(\overline{\Omega})$, and equip this space with
the inner product

$$\langle u, v \rangle = \int_\Omega (\nabla u \cdot \nabla v)\, dx,$$

where of course

$$\nabla u \cdot \nabla v = \sum_{j=1}^{d} \frac{\partial u}{\partial x_j} \overline{\frac{\partial v}{\partial x_j}}.$$

With this inner product, we have a corresponding norm given by
$\|u\|^2 = \langle u, u \rangle$. We note that $\|u\| = 0$ is the same as $\nabla u = 0$ throughout
$\Omega$, which means that $u$ is constant on each connected component of
$\Omega$. Thus we are led to consider equivalence classes in $C^1(\overline{\Omega})$ of elements
modulo functions that are constant on components of $\Omega$. These then
form a pre-Hilbert space with inner product and norm given as above.
We call this pre-Hilbert space $\mathcal{H}_0$.

In studying the completion $\mathcal{H}$ of $\mathcal{H}_0$ and its applications to the
boundary value problem, the following lemma is needed.

Lemma 4.9 Let $\Omega$ be an open bounded set in $\mathbb{R}^d$. Suppose $v$ belongs to
$C^1(\overline{\Omega})$ and $v$ vanishes on $\partial\Omega$. Then

$$\int_\Omega |v(x)|^2 \, dx \leq c_\Omega \int_\Omega |\nabla v(x)|^2 \, dx. \tag{29}$$

Proof. This conclusion could in fact be deduced from the considerations
given in Lemma 3.3. We prefer to prove this easy version separately
to highlight a simple idea that we shall also use later. It should be noted
that the argument yields the estimate $c_\Omega \leq d(\Omega)^2$, where $d(\Omega)$ is the
diameter of $\Omega$.

We proceed on the basis of the following observation. Suppose $f$ is a
function in $C^1(\overline{I})$, where $I = (a, b)$ is an interval in $\mathbb{R}$. Assume that $f$
vanishes at one of the end-points of $I$. Then

$$\int_I |f(t)|^2 \, dt \leq |I|^2 \int_I |f'(t)|^2 \, dt, \tag{30}$$

where $|I|$ denotes the length of $I$.

Indeed, suppose $f(a) = 0$. Then $f(s) = \int_a^s f'(t)\, dt$, and by the
Cauchy-Schwarz inequality

$$|f(s)|^2 \leq |I| \int_a^s |f'(t)|^2 \, dt \leq |I| \int_I |f'(t)|^2 \, dt.$$

Integrating this in $s$ over $I$ then yields (30).
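Inequality (30) can be sanity-checked on a few explicit functions. The sketch below is illustrative: the sample functions (each vanishing at the left end-point of its interval), the midpoint quadrature, and the helper names `integrate` and `poincare_sides` are our own choices, and a numerical check of this kind does not replace the proof.

```python
import math

def integrate(g, a, b, n=20000):
    """Midpoint rule for the integral of g over (a, b)."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

def poincare_sides(f, f_prime, a, b):
    """Return (integral of |f|^2, |I|^2 times integral of |f'|^2) over I = (a, b)."""
    lhs = integrate(lambda t: f(t) ** 2, a, b)
    rhs = (b - a) ** 2 * integrate(lambda t: f_prime(t) ** 2, a, b)
    return lhs, rhs

# Each sample function vanishes at the left end-point a of its interval.
samples = [
    (lambda t: math.sin(t), lambda t: math.cos(t), 0.0, 2.0),
    (lambda t: t * t - 1.0, lambda t: 2.0 * t, 1.0, 1.5),
    (lambda t: math.exp(t) - 1.0, lambda t: math.exp(t), 0.0, 3.0),
]
checks = [poincare_sides(f, fp, a, b) for f, fp, a, b in samples]
```

In each pair of `checks` the first entry is bounded by the second, typically with room to spare, which reflects the fact that the constant $|I|^2$ in (30) is not sharp.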

To prove (29), write $x = (x_1, x')$ with $x_1 \in \mathbb{R}$ and $x' \in \mathbb{R}^{d-1}$ and
apply (30) to $f$ defined by $f(x_1) = v(x_1, x')$, with $x'$ fixed. Let $J(x')$
be the open set in $\mathbb{R}$ that is the corresponding slice of $\Omega$ given by
$\{x_1 \in \mathbb{R} : (x_1, x') \in \Omega\}$. The set $J(x')$ can be written as a disjoint union
of open intervals $I_j$. (Note that in fact $f(x_1)$ vanishes at both end-points
of each $I_j$.) For each $j$, on applying (30) we obtain

$$\int_{I_j} |v(x_1, x')|^2 \, dx_1 \leq |I_j|^2 \int_{I_j} |\nabla v(x_1, x')|^2 \, dx_1.$$

Now since $|I_j| \leq d(\Omega)$, summing over the disjoint intervals $I_j$ gives

$$\int_{J(x')} |v(x_1, x')|^2 \, dx_1 \leq d(\Omega)^2 \int_{J(x')} |\nabla v(x_1, x')|^2 \, dx_1,$$

and an integration over $x' \in \mathbb{R}^{d-1}$ then leads to (29).

Now let $S_0$ denote the linear subspace of $C^1(\overline{\Omega})$ consisting of functions
that vanish on the boundary of $\Omega$. We note that distinct elements of $S_0$
remain distinct under the equivalence relation defining $\mathcal{H}_0$ (since
constants on each component that vanish on the boundary are zero), and so
$S_0$ may be identified with a subspace of $\mathcal{H}_0$. Denote by $S$ the closure in
$\mathcal{H}$ of this subspace, and let $P_S$ be the orthogonal projection of $\mathcal{H}$ onto $S$.

With these preliminaries out of the way, we first try to solve the
boundary value problem with $f$ given on $\partial\Omega$ under the additional assumption
that $f$ is the restriction to $\partial\Omega$ of a function $F$ in $C^1(\overline{\Omega})$. (How this
additional hypothesis can be removed will be explained below.)
Following the prescription of Dirichlet’s principle, we seek a sequence $\{u_n\}$
with $u_n \in C^1(\overline{\Omega})$ and $u_n|_{\partial\Omega} = F|_{\partial\Omega}$, such that the Dirichlet integrals
$\|u_n\|^2$ converge to a minimum value. This means that $u_n = F - v_n$,
with $v_n \in S_0$, and that $\lim_{n \to \infty} \|u_n\|$ minimizes the distance from $F$ to
$S_0$. Since $S = \overline{S_0}$, this sequence also minimizes the distance from $F$ to
$S$ in $\mathcal{H}$.

Now what do the elementary facts about orthogonal projections teach
us? According to the proof of Lemma 4.1 in the previous chapter, we
conclude that the sequence $\{v_n\}$, and hence also the sequence $\{u_n\}$,
both converge in the norm of $\mathcal{H}$, the former having a limit $P_S(F)$. Now
applying Lemma 4.9 to $v_n - v_m$ we deduce that $\{v_n\}$ and $\{u_n\}$ are also
Cauchy in the $L^2(\Omega)$-norm, and thus converge also in the $L^2$-norm. Let
$u = \lim_{n \to \infty} u_n$. Then

$$u = F - P_S(F). \tag{31}$$

We see that $u$ is weakly harmonic. Indeed, whenever $\psi \in C_0^\infty(\Omega)$, then
$\psi \in S$, and hence by (31) $\langle u, \psi \rangle = 0$. Therefore $\langle u_n, \psi \rangle \to 0$, but by
integration by parts, as we have seen,

$$\langle u_n, \psi \rangle = \int_\Omega (\nabla u_n \cdot \nabla \psi)\, dx = -\int_\Omega u_n \overline{\Delta\psi}\, dx = -(u_n, \Delta\psi).$$

As a result, $(u, \Delta\psi) = 0$, and so $u$ is weakly harmonic and thus can be
corrected on a set of measure zero to become harmonic.

This is the purported solution to our problem. However, two issues 
still remain to be resolved. 

The first is that while $u$ is the limit of a sequence $\{u_n\}$ of continuous
functions in $\overline{\Omega}$ and $u_n|_{\partial\Omega} = f$, for each $n$, it is not clear that $u$ itself is
continuous in $\overline{\Omega}$ and $u|_{\partial\Omega} = f$.

The second issue is that we restricted our argument above to those
$f$ defined on the boundary of $\Omega$ that arise as restrictions of functions
in $C^1(\overline{\Omega})$.

The second obstacle is the easier of the two to overcome, and this can
be done by the use of the following lemma, applied to the set $\Gamma = \partial\Omega$.

Lemma 4.10 Suppose $\Gamma$ is a compact set in $\mathbb{R}^d$, and $f$ is a continuous
function on $\Gamma$. Then there exists a sequence $\{F_n\}$ of smooth functions
on $\mathbb{R}^d$ so that $F_n \to f$ uniformly on $\Gamma$.

In fact, supposing we can deal with the first issue raised, then with the
lemma we proceed as follows. We find the functions $U_n$ that are
harmonic in $\Omega$, continuous on $\overline{\Omega}$, and such that $U_n|_{\partial\Omega} = F_n|_{\partial\Omega}$. Now since
the $\{F_n\}$ converges uniformly (to $f$) on $\partial\Omega$, it follows by the maximum
principle that the sequence $\{U_n\}$ converges uniformly to a function $u$
that is continuous on $\overline{\Omega}$, has the property that $u|_{\partial\Omega} = f$, and which is
moreover harmonic (by Corollary 4.8 above). This achieves our goal.

The proof of Lemma 4.10 is based on the following extension principle.

Lemma 4.11 Let $f$ be a continuous function on a compact subset $\Gamma$ of
$\mathbb{R}^d$. Then there exists a function $G$ on $\mathbb{R}^d$ that is continuous, and so that
$G|_\Gamma = f$.

Proof. We begin with the observation that if $K_0$ and $K_1$ are two
disjoint compact sets, there exists a continuous function $0 \leq g(x) \leq 1$ on
$\mathbb{R}^d$ which takes the value 0 on $K_0$ and 1 on $K_1$. Indeed, if $d(x, \Omega)$ denotes
the distance from $x$ to a set $\Omega$, we see that

$$g(x) = \frac{d(x, K_0)}{d(x, K_0) + d(x, K_1)}$$

has the required properties.

Now, we may assume without loss of generality that $f$ is non-negative
and bounded by 1 on $\Gamma$. Let

$$K_0 = \{x \in \Gamma : 2/3 \leq f(x) \leq 1\} \quad \text{and} \quad K_1 = \{x \in \Gamma : 0 \leq f(x) \leq 1/3\},$$

so that $K_0$ and $K_1$ are disjoint. Clearly, the observation before the
lemma guarantees that there exists a function $0 \leq G_1(x) \leq 1/3$ on $\mathbb{R}^d$
which takes the value $1/3$ on $K_0$ and 0 on $K_1$. Then we see that

$$0 \leq f(x) - G_1(x) \leq \frac{2}{3} \quad \text{for all } x \in \Gamma.$$

We now repeat the argument with $f$ replaced by $f - G_1$. In the first
step, we have gone from $0 \leq f \leq 1$ to $0 \leq f - G_1 \leq 2/3$. Consequently,
we may find a continuous function $G_2$ on $\mathbb{R}^d$ so that

$$0 \leq f(x) - G_1(x) - G_2(x) \leq \left( \frac{2}{3} \right)^2 \quad \text{on } \Gamma,$$

and $0 \leq G_2 \leq \frac{1}{3} \cdot \frac{2}{3}$. Repeating this process, we find continuous functions
$G_n$ on $\mathbb{R}^d$ such that

$$0 \leq f(x) - G_1(x) - \cdots - G_N(x) \leq \left( \frac{2}{3} \right)^N \quad \text{on } \Gamma,$$

and $0 \leq G_N \leq \frac{1}{3} \left( \frac{2}{3} \right)^{N-1}$ on $\mathbb{R}^d$. If we define

$$G = \sum_{n=1}^{\infty} G_n,$$

then $G$ is continuous and equals $f$ on $\Gamma$.
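The construction in this proof is explicit enough to be carried out by machine. The sketch below is an illustration under simplifying assumptions of ours: $\Gamma$ is a finite set of points on the real line (hence compact), $f$ takes values in $[0, 1]$, and the series is truncated after finitely many steps. The helper names `dist`, `make_step`, and `extend` are hypothetical; when one of the sets $K_0$, $K_1$ is empty, a constant function is used in place of the quotient of distances.

```python
def dist(x, points):
    """Distance from x to a finite set of real numbers (infinity if the set is empty)."""
    return min((abs(x - p) for p in points), default=float("inf"))

def make_step(k0, k1, height):
    """Continuous function equal to `height` on k0 and 0 on k1, with values in [0, height]."""
    def g(x):
        if not k0:
            return 0.0
        if not k1:
            return height
        d0, d1 = dist(x, k0), dist(x, k1)
        return height * d1 / (d0 + d1)
    return g

def extend(gamma, f, steps=40):
    """Build G = G_1 + ... + G_steps, which approximates f on the finite set gamma."""
    residual = dict(zip(gamma, f))   # f - (G_1 + ... + G_n) on gamma
    pieces, scale = [], 1.0          # scale = (2/3)^(n-1) bounds the residual
    for _ in range(steps):
        k0 = [x for x in gamma if residual[x] >= 2.0 * scale / 3.0]
        k1 = [x for x in gamma if residual[x] <= scale / 3.0]
        g = make_step(k0, k1, scale / 3.0)
        pieces.append(g)
        for x in gamma:
            residual[x] -= g(x)
        scale *= 2.0 / 3.0
    return lambda x: sum(g(x) for g in pieces)

gamma = [0.0, 0.1, 0.35, 0.5, 0.9, 1.0]
values = [0.2, 1.0, 0.0, 0.55, 0.7, 0.4]
G = extend(gamma, values)
errors = [abs(G(x) - v) for x, v in zip(gamma, values)]
```

After $N$ steps the error on $\Gamma$ is at most $(2/3)^N$, as in the proof, and the truncated sum stays between 0 and 1 everywhere.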

To complete the proof of Lemma 4.10, we argue as follows. We
regularize the function $G$ obtained in Lemma 4.11 by defining

$$F_\epsilon(x) = \epsilon^{-d} \int_{\mathbb{R}^d} G(x - y) \varphi(y/\epsilon)\, dy = \int_{\mathbb{R}^d} G(y) \varphi_\epsilon(x - y)\, dy,$$

with $\varphi_\epsilon(y) = \epsilon^{-d} \varphi(y/\epsilon)$, where $\varphi$ is a non-negative $C_0^\infty$ function
supported in the unit ball with $\int \varphi(y)\, dy = 1$. Then each $F_\epsilon$ is a $C^\infty$
function. However,

$$F_\epsilon(x) - G(x) = \int (G(y) - G(x)) \varphi_\epsilon(x - y)\, dy.$$

Since the integration above is restricted to $|x - y| \leq \epsilon$, then if $x \in \Gamma$, we
see that

$$|F_\epsilon(x) - G(x)| \leq \sup_{|x - y| \leq \epsilon} |G(x) - G(y)| \int \varphi_\epsilon(x - y)\, dy \leq \sup_{|x - y| \leq \epsilon} |G(x) - G(y)|.$$

The last quantity tends to zero with $\epsilon$ by the uniform continuity of $G$
near $\Gamma$, and if we choose $\epsilon = 1/n$ we obtain our desired sequence.

The two-dimensional theorem 

We now take up the problem of whether the proposed solution u takes 
on the desired boundary values. Here we limit our discussion to the case 
of two dimensions for the reason that in the higher dimensional situation 
the problems that arise involve a number of questions that would take 
us beyond the scope of this book. In contrast, in two dimensions, while 
the proof of the result below is a little tricky, it is within the reach of the 
Hilbert space methods we have been illustrating. 

The Dirichlet problem can be solved (in two dimensions as well as
in higher dimensions) only if certain restrictions are made concerning
the nature of the domain $\Omega$. The regularity we shall assume, while not
optimal,¹⁵ is broad enough to encompass many applications, and yet
has a simple geometric form. It can be described as follows. We fix an
initial triangle $T_0$ in $\mathbb{R}^2$. To be precise, we assume that $T_0$ is an isosceles
triangle whose two equal sides have length $\ell$, and make an angle $\alpha$ at
their common vertex. The exact values of $\ell$ and $\alpha$ are unimportant;
they may both be taken as small as one wishes, but must be kept fixed
throughout our discussion. With the shape of $T_0$ thus determined, we
say that $T$ is a special triangle if it is congruent to $T_0$, that is, $T$ arises
from $T_0$ by a translation and rotation. The vertex of $T$ is defined to be
the intersection of its two equal sides.

The regularity property of $\Omega$ we assume, the outside-triangle
condition, is as follows: with $\ell$ and $\alpha$ fixed, for each $x$ in the boundary of
$\Omega$, there is a special triangle with vertex $x$ whose interior lies outside $\Omega$.
(See Figure 5.)




Figure 5. The triangle To and the special triangle T 


Theorem 4.12 Let f] be an open bounded set in M 2 that satisfies the 
outside-triangle condition. If f is a continuous function on d^l, then the 
boundary value problem Au = 0 with u continuous in O and u\qq = f is 
always uniquely solvable. 

Some comments are in order. 

(1) If Q is bounded by a polygonal curve, it satisfies the conditions of 
the theorem. 

(2) More generally, if Q is appropriately bounded by finitely many Lips- 
chitz curves, or in particular C 1 curves, the conditions are also satisfied. 

(3) There are simple examples where the problem is not solvable: for 
instance, if Q is the punctured disc. This example of course does not 
satisfy the outside-triangle condition. 


^{15} The optimal conditions involve the notion of capacity of sets.

(4) The conditions on $\Omega$ in this theorem are not optimal: one can
construct examples where the problem is solvable but the above
regularity fails.

For more details on the above, see Exercise 19 and Problem 4. 
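
As an illustration of comment (3), the following short sketch (our own elaboration, written under the notation stated in its comments, and not a substitute for Exercise 19) indicates why the punctured disc cannot satisfy the outside-triangle condition at the origin.

```latex
% Sketch: the punctured disc fails the outside-triangle condition at 0.
% Assumed notation: \Omega = \{ x \in \mathbb{R}^2 : 0 < |x| < 1 \},
% so that 0 \in \partial\Omega.
Let $T$ be any special triangle with vertex $0$, and let $p$ be a point
in the interior of $T$. Since $T$ is convex and $0 \in T$, every point
\[
  t p, \qquad 0 < t \le 1,
\]
also lies in the interior of $T$. For $t$ small enough we have
$0 < |tp| < 1$, that is, $tp \in \Omega$. Hence the interior of $T$
meets $\Omega$, and no special triangle with vertex $0$ lies outside $\Omega$.
```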

We turn to the proof of the theorem. It is based on the following 
proposition, which may be viewed as a refined version of Lemma 4.9 
above. 

Proposition 4.13 For any bounded open set $\Omega$ in $\mathbb{R}^2$ that satisfies the
outside-triangle condition there are two constants $c_1 < 1$ and $c_2 > 1$ such
that the following holds. Suppose $z$ is a point in $\Omega$ whose distance from
$\partial\Omega$ is $\delta$. Then whenever $v$ belongs to $C^1(\overline{\Omega})$ and $v|_{\partial\Omega} = 0$, we have

(32)    $\int_{B_{c_1\delta}(z)} |v(x)|^2\,dx \le C\delta^2 \int_{B_{c_2\delta}(z)\cap\Omega} |\nabla v(x)|^2\,dx.$

The bound $C$ can be chosen to depend only on the diameter of $\Omega$, and the
parameters $\ell$ and $\alpha$ which determine the triangles $T$.



Figure 6. The situation in Proposition 4.13 


Let us see how the proposition proves the theorem. We have already
shown that it suffices to assume that $f$ is the restriction to $\partial\Omega$ of an
$F$ that belongs to $C^1(\overline{\Omega})$. We recall we had the minimizing sequence
$u_n = F - v_n$, with $v_n \in C^1(\overline{\Omega})$ and $v_n|_{\partial\Omega} = 0$. Moreover, this sequence
converges in the norm of $\mathcal{H}$ and $L^2(\Omega)$ to a limit $v$ such that $u = F - v$
is harmonic in $\Omega$. Then since (32) holds for each $v_n$, it also holds for
$v = F - u$; that is,

(33)    $\int_{B_{c_1\delta}(z)} |(F-u)(x)|^2\,dx \le C\delta^2 \int_{B_{c_2\delta}(z)\cap\Omega} |\nabla(F-u)(x)|^2\,dx.$

To prove the theorem it suffices, in view of the continuity of $u$ in $\Omega$, to
show that if $y$ is any fixed point in $\partial\Omega$, and $z$ is a variable point in $\Omega$,
then $u(z) \to f(y)$ as $z \to y$. Let $\delta = \delta(z)$ denote the distance of $z$ from
the boundary. Then $\delta(z) \le |z - y|$ and therefore $\delta(z) \to 0$ as $z \to y$.

We now consider the averages of $F$ and $u$ taken over the discs centered
at $z$ of radius $c_1\delta(z)$ (recall that $c_1 < 1$). We denote these averages
by $\mathrm{Av}(F)(z)$ and $\mathrm{Av}(u)(z)$, respectively. Then by the Cauchy-Schwarz
inequality, we have

    $|\mathrm{Av}(F)(z) - \mathrm{Av}(u)(z)|^2 \le \frac{1}{\pi(c_1\delta)^2} \int_{B_{c_1\delta}(z)} |F-u|^2\,dx,$

which by (33) is then majorized by

    $C' \int_{B_{c_2\delta}(z)\cap\Omega} |\nabla(F-u)|^2\,dx.$

The absolute continuity of the integral guarantees that the last integral
tends to zero with $\delta$, since $m(B_{c_2\delta}) \to 0$. However, by the mean-value
property, $\mathrm{Av}(u)(z) = u(z)$, while by the continuity of $F$ in $\overline{\Omega}$,

    $\mathrm{Av}(F)(z) = \frac{1}{m(B_{c_1\delta}(z))} \int_{B_{c_1\delta}(z)} F(x)\,dx \to f(y),$

because $F|_{\partial\Omega} = f$ and $z \to y$. Altogether this gives $u(z) \to f(y)$, and the
theorem is proved, once the proposition is established.
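
For the reader's convenience, the Cauchy-Schwarz step and the resulting constant can be written out as follows. This is only a sketch of the computation behind the two displays above; the abbreviation $B$ and the explicit value of $C'$ are our own bookkeeping, not notation fixed by the text.

```latex
% Assumed abbreviation: B = B_{c_1\delta}(z), so m(B) = \pi (c_1\delta)^2.
\begin{align*}
|\mathrm{Av}(F)(z) - \mathrm{Av}(u)(z)|^2
  &= \Big| \frac{1}{m(B)} \int_B (F-u)(x)\,dx \Big|^2 \\
  &\le \frac{1}{m(B)^2}\; m(B) \int_B |F-u|^2\,dx
   \;=\; \frac{1}{\pi (c_1\delta)^2} \int_B |F-u|^2\,dx \\
  &\le \frac{C\delta^2}{\pi c_1^2\,\delta^2}
       \int_{B_{c_2\delta}(z)\cap\Omega} |\nabla (F-u)|^2\,dx .
\end{align*}
% The last step uses (33). One admissible choice is therefore
% C' = C / (\pi c_1^2), which does not depend on \delta.
```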


To prove the proposition, we construct for each $z \in \Omega$ whose distance
from $\partial\Omega$ is $\delta$, and for $\delta$ sufficiently small, a rectangle $R$ with the following
properties:

(1) $R$ has side lengths $2c_1\delta$ and $M\delta$ (with $c_1 \le 1/2$, $M \le 4$).

(2) $B_{c_1\delta}(z) \subset R$.

(3) Each segment in $R$ that is parallel to, and of length equal to, the
long side intersects the boundary of $\Omega$.

To obtain $R$ we let $y$ be a point in $\partial\Omega$ so that $\delta = |z - y|$, and we apply
the outside-triangle condition at $y$. As a result, the line joining $z$ with
$y$ and one of the sides of the special triangle whose vertex is at $y$ must
make an angle $\beta < \pi$. (In fact $\beta \le \pi - \alpha/2$, as is easily seen.) Now after
a suitable rotation and translation we may assume that $y = 0$ and that
the angle going from the $x_2$-axis to the line joining $z$ to $0$ is equal to the
angle of the side of the triangle to the $x_2$-axis. This angle can be taken
to be $\gamma$, with $\gamma \ge \alpha/4$. (See Figure 7.)

Figure 7. Placement of the rectangle $R$

There is an alternate possibility that occurs with this figure reflected
through the $x_2$-axis.

With this picture in mind we construct the rectangle $R$ as indicated
in Figure 8.

It has its long side parallel to the $x_2$-axis, contains the disc $B_{c_1\delta}(z)$,
and every segment in $R$ parallel to the $x_2$-axis intersects the (extension) of
the side of the triangle.

Note that the coordinates of $z$ are $(-\delta\sin\gamma, \delta\cos\gamma)$. We choose
$c_1 < \sin\gamma$; then $B_{c_1\delta}(z)$ lies in the same (left) half-plane as $z$.

We next focus our attention on two points: $P_1$, which lies on the
$x_1$-axis at the intersection of this axis with the far side of the rectangle; and
$P_2$, which is at the corner of that side of the rectangle, that is, at the
intersection of the (continuation) of the side of the outside triangle and
the further side of the rectangle. The coordinates of $P_1$ are $(-a, 0)$, where
$a = \delta c_1 + \delta\sin\gamma$. The coordinates of $P_2$ are $(-a, -a\frac{\cos\gamma}{\sin\gamma})$. Note that the
distance of $P_2$ from the origin is $a/\sin\gamma$, which is $\delta + c_1\delta/\sin\gamma < 2\delta$,
since $c_1 < \sin\gamma$.
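
The distance of $P_2$ from the origin can be checked directly. The lines below are a small verification sketch, assuming only the coordinates of $P_2$ and the value of $a$ as described in the paragraph above.

```latex
% Assumed data: P_2 = (-a, -a\cos\gamma/\sin\gamma),
%               a = \delta c_1 + \delta\sin\gamma,  0 < c_1 < \sin\gamma.
\begin{align*}
|P_2|^2 &= a^2 + a^2\,\frac{\cos^2\gamma}{\sin^2\gamma}
         = a^2\,\frac{\sin^2\gamma + \cos^2\gamma}{\sin^2\gamma}
         = \frac{a^2}{\sin^2\gamma}, \\
|P_2|   &= \frac{a}{\sin\gamma}
         = \frac{\delta c_1 + \delta\sin\gamma}{\sin\gamma}
         = \delta + \frac{c_1\delta}{\sin\gamma}
         < \delta + \delta = 2\delta .
\end{align*}
```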

Now we observe that the length of the larger side of the rectangle is
the sum of the part that lies above the $x_1$-axis and the part that lies
below. The upper part has length the sum of the radius of the disc plus
the height of $z$, and this is $c_1\delta + \delta\cos\gamma \le 2\delta$. The lower part has length
equal to $a/\tan\gamma$, which is $\delta\cos\gamma + c_1\delta\frac{\cos\gamma}{\sin\gamma} \le 2\delta$, since $c_1 < \sin\gamma$. Thus
we find that the length of the side is $\le 4\delta$.

Figure 8. The disc $B_{c_1\delta}(z)$ and the rectangle $R$ containing it

Now it is clear from the construction that each vertical segment in $R$
starting from the disc $B_{c_1\delta}(z)$, when continued downward and parallel to
the $x_2$-axis, intersects the line joining $0$ to $P_2$ (which is a continuation
of the side of the triangle). Moreover, if the length $\ell$ of this side of the
triangle exceeds the distance of $P_2$ from the origin, then the segment
intersects the triangle. When this intersection occurs the segment starting
from $B_{c_1\delta}(z)$ must also intersect the boundary of $\Omega$, since the triangle
lies outside $\Omega$. Therefore if $\ell \ge 2\delta$ the desired intersection occurs, and
each of the conclusions (1), (2), and (3) is verified. (We shall lift the
restriction $\delta \le \ell/2$ momentarily.)


Now we integrate over each line segment parallel to the $x_2$-axis in $R$,
including its portion in $B_{c_1\delta}(z)$, which is continued downward until it
meets $\partial\Omega$. Call such a segment $I(x_1)$. Then, using (30) we see that

    $\int_{I(x_1)} |v(x_1,x_2)|^2\,dx_2 \le M^2\delta^2 \int_{I(x_1)} \left|\frac{\partial v}{\partial x_2}(x_1,x_2)\right|^2 dx_2,$

and an integration in $x_1$ gives

    $\int_{R\cap\Omega} |v(x)|^2\,dx \le M^2\delta^2 \int_{R\cap\Omega} |\nabla v(x)|^2\,dx.$

However, we note that $B_{c_1\delta}(z) \subset R$, and $B_{c_2\delta}(z) \supset R$ when $c_2 \ge 2$. Thus
the desired inequality (32) is established, still under the assumption that
$\delta$ is small, that is, $\delta \le \ell/2$. When $\delta > \ell/2$ it suffices merely to use the
crude estimate (29) and the proposition is then proved. The proof of the
theorem is therefore complete.

5 Exercises 

1. Suppose $f \in L^2(\mathbb{R}^d)$ and $k \in L^1(\mathbb{R}^d)$.

(a) Show that $(f * k)(x) = \int f(x - y)k(y)\,dy$ converges for a.e. $x$.

(b) Prove that $\|f * k\|_{L^2(\mathbb{R}^d)} \le \|f\|_{L^2(\mathbb{R}^d)} \|k\|_{L^1(\mathbb{R}^d)}$.

(c) Establish $\widehat{(f * k)}(\xi) = \hat{k}(\xi)\hat{f}(\xi)$ for a.e. $\xi$.

(d) The operator $Tf = f * k$ is a Fourier multiplier operator with multiplier
$m(\xi) = \hat{k}(\xi)$.

[Hint: See Exercise 21 in Chapter 2.]

2. Consider the Mellin transform defined initially for continuous functions $f$ of
compact support in $\mathbb{R}^+ = \{t \in \mathbb{R} : t > 0\}$ and $x \in \mathbb{R}$ by

    $\mathcal{M}f(x) = \int_0^\infty f(t)\,t^{ix-1}\,dt.$

Prove that $(2\pi)^{-1/2}\mathcal{M}$ extends to a unitary operator from $L^2(\mathbb{R}^+, dt/t)$ to $L^2(\mathbb{R})$.
The Mellin transform serves on $\mathbb{R}^+$, with its multiplicative structure, the same
purpose as the Fourier transform on $\mathbb{R}$, with its additive structure.

3. Let $F(z)$ be a bounded holomorphic function in the half-plane. Show in two
ways that $\lim_{y\to 0} F(x + iy)$ exists for a.e. $x$.

(a) By using the fact that $F(z)/(z + i)$ is in $H^2(\mathbb{R}^2_+)$.

(b) By noting that $G(z) = F\left(i\,\frac{1-z}{1+z}\right)$ is a bounded holomorphic function in the
unit disc, and using Exercise 17 in the previous chapter.

4. Consider $F(z) = e^{i/z}/(z + i)$ in the upper half-plane. Note that $F(x + iy) \in
L^2(\mathbb{R})$, for each $y > 0$ and $y = 0$. Observe also that $F(z) \to 0$ as $|z| \to \infty$. However,
$F \notin H^2(\mathbb{R}^2_+)$. Why?

5. For $a < b$, let $S_{a,b}$ denote the strip $\{z = x + iy,\ a < y < b\}$. Define $H^2(S_{a,b})$
to consist of the holomorphic functions $F$ in $S_{a,b}$ so that

    $\|F\|^2_{H^2(S_{a,b})} = \sup_{a<y<b} \int_{\mathbb{R}} |F(x + iy)|^2\,dx < \infty.$

Define $H^2(S_{a,\infty})$ and $H^2(S_{-\infty,b})$ to be the obvious variants of the Hardy spaces
for the half-planes $\{z = x + iy,\ y > a\}$ and $\{z = x + iy,\ y < b\}$, respectively.

(a) Show that $F \in H^2(S_{a,b})$ if and only if $F$ can be written as

    $F(z) = \int_{\mathbb{R}} f(\xi)\,e^{-2\pi i z\xi}\,d\xi$

with $\int_{\mathbb{R}} |f(\xi)|^2 (e^{4\pi a\xi} + e^{4\pi b\xi})\,d\xi < \infty$.

(b) Prove that every $F \in H^2(S_{a,b})$ can be decomposed as $F = G_1 + G_2$, where
$G_1 \in H^2(S_{a,\infty})$ and $G_2 \in H^2(S_{-\infty,b})$.

(c) Show that $\lim_{a<y<b,\ y\to a} F(x + iy) = F_a(x)$ exists in the $L^2$-norm and also
almost everywhere, with a similar result for $\lim_{a<y<b,\ y\to b} F(x + iy)$.


6. Suppose $\Omega$ is an open set in $\mathbb{C} = \mathbb{R}^2$, and let $\mathcal{H}$ be the subspace of $L^2(\Omega)$
consisting of holomorphic functions on $\Omega$. Show that $\mathcal{H}$ is a closed subspace of
$L^2(\Omega)$, and hence is a Hilbert space with inner product

    $(f, g) = \int_\Omega f(z)\overline{g(z)}\,dx\,dy$, where $z = x + iy$.

[Hint: Prove that for $f \in \mathcal{H}$, we have $|f(z)| \le \frac{c}{d(z,\Omega^c)}\|f\|$ for $z \in \Omega$, where $c =
\pi^{-1/2}$, using the mean-value property (9). Thus if $\{f_n\}$ is a Cauchy sequence in
$\mathcal{H}$, it converges uniformly on compact subsets of $\Omega$.]


7. Following up on the previous exercise, prove:

(a) If $\{\varphi_n\}_{n=0}^\infty$ is an orthonormal basis of $\mathcal{H}$, then

    $\sum_{n=0}^\infty |\varphi_n(z)|^2 \le \frac{c^2}{d(z,\Omega^c)^2}$  for $z \in \Omega$.

(b) The sum

    $B(z,w) = \sum_{n=0}^\infty \varphi_n(z)\overline{\varphi_n(w)}$

converges absolutely for $(z,w) \in \Omega \times \Omega$, and is independent of the choice of
the orthonormal basis $\{\varphi_n\}$ of $\mathcal{H}$.

(c) To prove (b) it is useful to characterize the function $B(z,w)$, called the
Bergman kernel, by the following property. Let $T$ be the linear transformation
on $L^2(\Omega)$ defined by

    $Tf(z) = \int_\Omega B(z,w)f(w)\,du\,dv$,  $w = u + iv$.

Then $T$ is the orthogonal projection of $L^2(\Omega)$ to $\mathcal{H}$.

(d) Suppose that $\Omega$ is the unit disc. Then $f \in \mathcal{H}$ exactly when $f(z) = \sum_{n=0}^\infty a_n z^n$,
with

    $\sum_{n=0}^\infty |a_n|^2 (n+1)^{-1} < \infty.$

Also, the sequence $\left\{\frac{z^n (n+1)^{1/2}}{\pi^{1/2}}\right\}_{n=0}^\infty$ is an orthonormal basis of $\mathcal{H}$. Moreover,
in this case

    $B(z,w) = \frac{1}{\pi(1 - z\overline{w})^2}.$

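8. Continuing with Exercise 6, suppose $\Omega$ is the upper half-plane $\mathbb{R}^2_+$. Then every
$f \in \mathcal{H}$ has a representation

(34)    $f(z) = \sqrt{4\pi} \int_0^\infty \xi^{1/2}\hat{f}_0(\xi)\,e^{2\pi i\xi z}\,d\xi$,  $z \in \mathbb{R}^2_+$,

where $\int_0^\infty |\hat{f}_0(\xi)|^2\,d\xi < \infty$. Moreover, the mapping $\hat{f}_0 \to f$ given by (34) is a
unitary mapping from $L^2((0,\infty), d\xi)$ to $\mathcal{H}$.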

9. Let $H$ be the Hilbert transform. Verify that

(a) $H^* = -H$, $H^2 = -I$, and $H$ is unitary.

(b) If $\tau_h$ denotes the translation operator, $\tau_h(f)(x) = f(x - h)$, then $H$
commutes with $\tau_h$, $\tau_h H = H\tau_h$.

(c) If $\delta_a$ denotes the dilation operator, $\delta_a(f)(x) = f(ax)$ with $a > 0$, then $H$
commutes with $\delta_a$, $\delta_a H = H\delta_a$.

A converse is given in Problem 5 below.

10. Let $f \in L^2(\mathbb{R})$ and let $u(x,y)$ be the Poisson integral of $f$, that is $u = (f *
\mathcal{P}_y)(x)$, as given in (10) above. Let $v(x,y) = (Hf * \mathcal{P}_y)(x)$, the Poisson integral of
the Hilbert transform of $f$. Prove that:

(a) $F(x + iy) = u(x,y) + iv(x,y)$ is analytic in the half-plane $\mathbb{R}^2_+$, so that $u$ and
$v$ are conjugate harmonic functions. We also have $f = \lim_{y\to 0} u(x,y)$ and
$Hf = \lim_{y\to 0} v(x,y)$.

(b) $F(z) = \frac{1}{\pi i}\int_{\mathbb{R}} f(t)\,\frac{dt}{t - z}$.

(c) $v(x,y) = f * Q_y$, where $Q_y(x) = \frac{1}{\pi}\frac{x}{x^2 + y^2}$ is the conjugate Poisson kernel.

[Hint: Note that $\frac{i}{\pi z} = \mathcal{P}_y(x) + iQ_y(x)$, $z = x + iy$.]


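11. Show that

    $\left\{\frac{1}{\pi^{1/2}(i + z)}\left(\frac{i - z}{i + z}\right)^n\right\}_{n=0}^\infty$

is an orthonormal basis of $H^2(\mathbb{R}^2_+)$.

Note that $\left\{\frac{1}{\pi^{1/2}(i + x)}\left(\frac{i - x}{i + x}\right)^n\right\}_{n=-\infty}^\infty$ is an orthonormal basis of $L^2(\mathbb{R})$; see
Exercise 9 in the previous chapter.

[Hint: It suffices to show that if $F \in H^2(\mathbb{R}^2_+)$ and

    $\int_{-\infty}^\infty F(x)\,\frac{(x + i)^n}{(x - i)^{n+1}}\,dx = 0$  for $n = 0, 1, 2, \ldots$,

then $F = 0$. Use the Cauchy integral formula to prove that

    $\left(\frac{d}{dz}\right)^n \left(F(z)(z + i)^n\right)\Big|_{z=i} = 0$,

and thus $F^{(n)}(i) = 0$ for $n = 0, 1, 2, \ldots$]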

12. We consider whether the inequality

    $\|u\|_{L^2(\Omega)} \le c\|L(u)\|_{L^2(\Omega)}$

can hold for open sets $\Omega$ that are unbounded.

(a) Assume $d \ge 2$. Show that for each constant coefficient partial differential
operator $L$, there are unbounded connected open sets $\Omega$ for which the above
holds for all $u \in C_0^\infty(\Omega)$.

(b) Show that $\|u\|_{L^2(\mathbb{R}^d)} \le c\|L(u)\|_{L^2(\mathbb{R}^d)}$ for all $u \in C_0^\infty(\mathbb{R}^d)$ if and only if
$|P(\xi)| \ge c > 0$ for all $\xi$, where $P$ is the characteristic polynomial of $L$.

[Hint: For (a) consider first $L = (\partial/\partial x_1)^n$ and a strip $\{x : -1 < x_1 < 1\}$.]

13. Suppose $L$ is a linear partial differential operator with constant coefficients.
Show that when $d \ge 2$, the linear space of solutions $u$ of $L(u) = 0$ with $u \in C^\infty(\mathbb{R}^d)$
is not finite-dimensional.

[Hint: Consider the zeroes $\zeta$ of $P(\zeta)$, $\zeta \in \mathbb{C}^d$, where $P$ is the characteristic
polynomial of $L$.]

14. Suppose $F$ and $G$ are two integrable functions on a bounded interval $[a,b]$.
Show that $G$ is the weak derivative of $F$ if and only if $F$ can be corrected on a set
of measure 0, such that $F$ is absolutely continuous and $F'(x) = G(x)$ for almost
every $x$.

[Hint: If $G$ is the weak derivative of $F$, use an approximation to show that

    $\int_a^b G(x)\varphi(x)\,dx = -\int_a^b F(x)\varphi'(x)\,dx$

holds for the function $\varphi$ illustrated in Figure 9.]

15. Suppose $f \in L^2(\mathbb{R}^d)$. Prove that there exists $g \in L^2(\mathbb{R}^d)$ such that

    $\left(\frac{\partial}{\partial x}\right)^\alpha f(x) = g(x)$

in the weak sense, if and only if

    $(2\pi i\xi)^\alpha \hat{f}(\xi) = \hat{g}(\xi) \in L^2(\mathbb{R}^d).$


16. Sobolev embedding theorem. Suppose $n$ is the smallest integer $> d/2$. If

    $f \in L^2(\mathbb{R}^d)$  and  $\left(\frac{\partial}{\partial x}\right)^\alpha f \in L^2(\mathbb{R}^d)$

in the weak sense, for all $1 \le |\alpha| \le n$, then $f$ can be modified on a set of measure
zero so that $f$ is continuous and bounded.

[Hint: Express $f$ in terms of $\hat{f}$, and show that $\hat{f} \in L^1(\mathbb{R}^d)$ by the Cauchy-Schwarz
inequality.]

17. The conclusion of the Sobolev embedding theorem fails when $n = d/2$.
Consider the case $d = 2$, and let $f(x) = (\log 1/|x|)^\alpha \eta(x)$, where $\eta$ is a smooth
cut-off function with $\eta = 1$ for $x$ near the origin, but $\eta(x) = 0$ if $|x| \ge 1/2$. Let
$0 < \alpha < 1/2$.

(a) Verify that $\partial f/\partial x_1$ and $\partial f/\partial x_2$ are in $L^2$ in the weak sense.

(b) Show that $f$ cannot be corrected on a set of measure zero such that the
resulting function is continuous at the origin.


18. Consider the linear partial differential operator

    $L = \sum_{|\alpha| \le n} a_\alpha \left(\frac{\partial}{\partial x}\right)^\alpha.$

Then

    $P(\xi) = \sum_{|\alpha| \le n} a_\alpha (2\pi i\xi)^\alpha$

is called the characteristic polynomial of $L$. The differential operator $L$ is said
to be elliptic if

    $|P(\xi)| \ge c|\xi|^n$ for some $c > 0$ and all $\xi$ sufficiently large.

(a) Check that $L$ is elliptic if and only if $\sum_{|\alpha| = n} a_\alpha (2\pi\xi)^\alpha$ vanishes only when
$\xi = 0$.

(b) If $L$ is elliptic, prove that for some $c > 0$ the inequality

    $\left\|\left(\frac{\partial}{\partial x}\right)^\alpha \varphi\right\|_{L^2(\mathbb{R}^d)} \le c\left(\|L\varphi\|_{L^2(\mathbb{R}^d)} + \|\varphi\|_{L^2(\mathbb{R}^d)}\right)$

holds for all $\varphi \in C_0^\infty(\mathbb{R}^d)$ and $|\alpha| \le n$.

(c) Conversely, if (b) holds then $L$ is elliptic.



19. Suppose $u$ is harmonic in the punctured unit disc $\mathbb{D}^* = \{z \in \mathbb{C} : 0 < |z| < 1\}$.

(a) Show that if u is also continuous at the origin, then u is harmonic throughout 
the unit disc. 

[Hint: Show that u is weakly harmonic.] 

(b) Prove that the Dirichlet problem for the punctured unit disc is in general 
not solvable. 


20. Let $F$ be a continuous function on the closure $\overline{\mathbb{D}}$ of the unit disc. Assume that
$F$ is in $C^1$ on the (open) disc $\mathbb{D}$, and $\int_{\mathbb{D}} |\nabla F|^2 < \infty$.

Let $f(e^{i\theta})$ denote the restriction of $F$ to the unit circle, and write $f(e^{i\theta}) \sim
\sum_{n=-\infty}^\infty a_n e^{in\theta}$. Prove that $\sum_{n=-\infty}^\infty |n|\,|a_n|^2 < \infty$.

[Hint: Write $F(re^{i\theta}) \sim \sum_{n=-\infty}^\infty F_n(r)e^{in\theta}$, with $F_n(1) = a_n$. Express $\int_{\mathbb{D}} |\nabla F|^2$ in
polar coordinates, and use the fact that

    $\frac{1}{2}|F(1)|^2 \le L^{-1}\int_{1/2}^1 |F'(r)|^2\,dr + L\int_{1/2}^1 |F(r)|^2\,dr,$

for $L \ge 2$; apply this to $F = F_n$, $L = |n|$.]

6 Problems

1. Suppose $F_0(x) \in L^2(\mathbb{R})$. Then a necessary and sufficient condition that there
exists an entire analytic function $F$, such that $|F(z)| \le Ae^{a|z|}$ for all $z \in \mathbb{C}$, and
$F_0(x) = F(x)$ a.e. $x \in \mathbb{R}$, is that $\hat{F}_0(\xi) = 0$ whenever $|\xi| > a/2\pi$.

[Hint: Consider the regularization $F^\epsilon(z) = \int_{-\infty}^\infty F(z - t)\varphi_\epsilon(t)\,dt$ and apply to it
the considerations in Theorem 3.3 of Chapter 4 in Book II.]

2. Suppose $\Omega$ is an open bounded subset of $\mathbb{R}^2$. A boundary Lipschitz arc $\gamma$ is
a portion of $\partial\Omega$ which after a rotation of the axes is represented as

    $\gamma = \{(x_1, x_2) : x_2 = \eta(x_1),\ a \le x_1 \le b\},$

where $a < b$ and $\gamma \subset \partial\Omega$. It is also supposed that

(35)    $|\eta(x_1) - \eta(x_1')| \le M|x_1 - x_1'|$, whenever $x_1, x_1' \in [a,b]$,

and moreover if $\gamma_\delta = \{(x_1, x_2) : x_2 - \delta \le \eta(x_1) \le x_2\}$, then $\gamma_\delta \cap \Omega = \emptyset$ for some
$\delta > 0$. (Note that the condition (35) is satisfied if $\eta \in C^1([a,b])$.)

Suppose $\Omega$ satisfies the following condition. There are finitely many open discs
$D_1, D_2, \ldots, D_N$ with the property that $\bigcup_j D_j$ contains $\partial\Omega$ and for each $j$, $\partial\Omega \cap D_j$
is a boundary Lipschitz arc (see Figure 10). Then $\Omega$ verifies the outside-triangle
condition of Theorem 4.12, guaranteeing the solvability of the boundary value
problem.



Figure 10. A domain with boundary Lipschitz arcs 


3.* Suppose the bounded domain $\Omega$ has as its boundary a closed simple continuous
curve. Then the boundary value problem is solvable for $\Omega$. This is because there
exists a conformal map $\Phi$ of the unit disc $\mathbb{D}$ to $\Omega$ that extends to a continuous
bijection from $\overline{\mathbb{D}}$ to $\overline{\Omega}$. (See Section 1.3 and Problem 6* in Chapter 8 of Book II.)

4. Consider the two domains $\Omega$ in $\mathbb{R}^2$ given by Figure 11.



Domain I 



Figure 11. Domains with a cusp 


The set I has as its boundary a smooth curve, with the exception of an (inside) 
cusp. The set II is similar, except it has an outside cusp. Both I and II fall 
within the scope of the result of Problem 3, and hence the boundary value problem 
is solvable in each case. However, II satisfies the outside-triangle condition while 
I does not. 

5. Let $T$ be a Fourier multiplier operator on $L^2(\mathbb{R}^d)$. That is, suppose there
is a bounded function $m$ such that $\widehat{(Tf)}(\xi) = m(\xi)\hat{f}(\xi)$, all $f \in L^2(\mathbb{R}^d)$. Then $T$
commutes with translations, $\tau_h T = T\tau_h$, where $\tau_h(f)(x) = f(x - h)$, for all $h \in \mathbb{R}^d$.

Conversely any bounded operator on $L^2(\mathbb{R}^d)$ that commutes with translations
is a Fourier multiplier operator.

[Hint: It suffices to prove that if a bounded operator $\hat{T}$ commutes with
multiplication by exponentials $e^{2\pi i\xi\cdot h}$, $h \in \mathbb{R}^d$, then there is an $m$ so that $\hat{T}g(\xi) = m(\xi)g(\xi)$
for all $g \in L^2(\mathbb{R}^d)$. To do this, show first that

    $\hat{T}(\Phi g) = \Phi\hat{T}(g)$, all $g \in L^2(\mathbb{R}^d)$, whenever $\Phi \in C_0^\infty(\mathbb{R}^d)$.

Next, for large $N$, choose $\Phi$ so that it equals 1 in the ball $|\xi| \le N$. Then $m(\xi) =
\hat{T}(\Phi)(\xi)$ for $|\xi| \le N$.]

As a consequence of this theorem show that if $T$ is a bounded operator on $L^2(\mathbb{R})$
that commutes with translations and dilations (as in Exercise 9 above), then

(a) If $(Tf)(-x) = T(f(-x))$ it follows $T = cI$, where $c$ is an appropriate
constant and $I$ the identity operator.

(b) If $(Tf)(-x) = -T(f(-x))$, then $T = cH$, where $c$ is an appropriate constant
and $H$ the Hilbert transform.


6. This problem provides an example of the contrast between analysis on $L^1(\mathbb{R}^d)$
and $L^2(\mathbb{R}^d)$.

Recall that if $f$ is locally integrable on $\mathbb{R}^d$, the maximal function $f^*$ is defined
by

    $f^*(x) = \sup_{x \in B} \frac{1}{m(B)} \int_B |f(y)|\,dy,$

where the supremum is taken over all balls containing the point $x$.

Complete the following outline to prove that there exists a constant $C$ so that

    $\|f^*\|_{L^2(\mathbb{R}^d)} \le C\|f\|_{L^2(\mathbb{R}^d)}.$

In other words, the map that takes $f$ to $f^*$ (although not linear) is bounded
on $L^2(\mathbb{R}^d)$. This differs notably from the situation in $L^1(\mathbb{R}^d)$, as we observed in
Chapter 3.

(a) For each $\alpha > 0$, prove that if $f \in L^2(\mathbb{R}^d)$, then

    $m(\{x : f^*(x) > \alpha\}) \le \frac{2A}{\alpha} \int_{|f| > \alpha/2} |f(x)|\,dx.$

Here, $A = 3^d$ will do.

[Hint: Consider $f_1(x) = f(x)$ if $|f(x)| \ge \alpha/2$ and 0 otherwise. Check that
$f_1 \in L^1(\mathbb{R}^d)$, and

    $\{x : f^*(x) > \alpha\} \subset \{x : f_1^*(x) > \alpha/2\}$.]

(b) Show that

    $\int_{\mathbb{R}^d} |f^*(x)|^2\,dx = 2\int_0^\infty \alpha\,m(E_\alpha)\,d\alpha,$

where $E_\alpha = \{x : f^*(x) > \alpha\}$.

(c) Prove that $\|f^*\|_{L^2(\mathbb{R}^d)} \le C\|f\|_{L^2(\mathbb{R}^d)}$.



Abstract Measure and 
Integration Theory 


What immediately suggests itself, then, is that these
characteristic properties themselves be treated as the
main object of investigation, by defining and dealing
with abstract objects which need satisfy no other
conditions than those required by the very theory to be
developed.

This procedure has been made use of — more or 
less consciously — by mathematicians of every era. 
The geometry of Euclid and the literal algebra of the 
sixteenth and seventeenth centuries arose in this way. 
But only in more recent times has this method, called 
the axiomatic method, been consistently developed 
and carried through to its logical conclusion. 

It is our intention to treat the theories of measure 
and integration by means of the axiomatic method just 
described. 

C. Carathéodory, 1918


In much of mathematics integration plays a significant role. It is used,
in one form or another, when dealing with questions that arise in analysis
on a variety of different spaces. While in some situations it suffices to
integrate continuous or other simple functions on these spaces, the deeper
study of a number of other problems requires integration based on the
more refined ideas of measure theory. The development of these ideas,
going beyond the setting of the Euclidean space $\mathbb{R}^d$, is the goal of this
chapter.

The starting point is a fruitful insight of Carathéodory and the
resulting theorems that lead to construction of measures in very general
circumstances. Once this has been achieved, the deduction of the
fundamental facts about integration in the general context then follows a
familiar path.

We apply the abstract theory to obtain several useful results: the
theory of product measures; the polar coordinate integration formula,
which is a consequence of this; the construction of the Lebesgue-Stieltjes
integral and its corresponding Borel measure on the real line; and the
general notion of absolute continuity. Finally, we treat some of the basic
limit theorems of ergodic theory. This not only gives an illustration of
the abstract framework we have established, but also provides a link with
the differentiation theorems studied in Chapter 3.

1 Abstract measure spaces 

A measure space consists of a set $X$ equipped with two fundamental
objects:

(I) A $\sigma$-algebra $\mathcal{M}$ of "measurable" sets, which is a non-empty
collection of subsets of $X$ closed under complements and countable
unions and intersections.

(II) A measure $\mu : \mathcal{M} \to [0,\infty]$ with the following defining property:
if $E_1, E_2, \ldots$ is a countable family of disjoint sets in $\mathcal{M}$, then

    $\mu\left(\bigcup_{n=1}^\infty E_n\right) = \sum_{n=1}^\infty \mu(E_n).$

A measure space is therefore often denoted by the triple $(X, \mathcal{M}, \mu)$ to
emphasize its three main components. Sometimes, however, when there is
no ambiguity we will abbreviate this notation by referring to the measure
space as $(X, \mu)$, or simply $X$.

A feature that a measure space often enjoys is the property of being
$\sigma$-finite. This means that $X$ can be written as the union of countably
many measurable sets of finite measure.

At this early stage we give only two simple examples of measure spaces: 

(i) The first is the discrete example with $X$ a countable set, $X =
\{x_n\}_{n=1}^\infty$, $\mathcal{M}$ the collection of all subsets of $X$, and the measure
$\mu$ determined by $\mu(x_n) = \mu_n$, with $\{\mu_n\}_{n=1}^\infty$ a given sequence of
(extended) non-negative numbers. Note that $\mu(E) = \sum_{x_n \in E} \mu_n$.
When $\mu_n = 1$ for all $n$, we call $\mu$ the counting measure, and also
denote it by $\#$. In this case integration will amount to nothing but
the summation of (absolutely) convergent series.

(ii) Here $X = \mathbb{R}^d$, $\mathcal{M}$ is the collection of Lebesgue measurable sets, and
$\mu(E) = \int_E f\,dx$, where $f$ is a given non-negative measurable
function on $\mathbb{R}^d$. The case $f = 1$ corresponds to the Lebesgue measure.
The countable additivity of $\mu$ follows from the usual additivity and
limiting properties of integrals of non-negative functions proved in
Chapter 2.
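
To make the discrete example concrete, here is a small worked instance. The particular weights $\mu_n = 2^{-n}$ and the sets $E_1$, $E_2$ are our own illustrative choices and are not taken from the text.

```latex
% Illustrative assumption: X = \{x_n\}_{n=1}^\infty with weights \mu_n = 2^{-n}.
% E_1 = \{x_2, x_4, x_6, \dots\} and E_2 = \{x_1, x_3, x_5, \dots\} are disjoint.
\begin{align*}
\mu(X)   &= \sum_{n=1}^\infty 2^{-n} = 1, \\
\mu(E_1) &= \sum_{k=1}^\infty 2^{-2k} = \sum_{k=1}^\infty 4^{-k} = \tfrac{1}{3}, \\
\mu(E_2) &= \sum_{k=1}^\infty 2^{-(2k-1)} = 2\sum_{k=1}^\infty 4^{-k} = \tfrac{2}{3}, \\
\mu(E_1 \cup E_2) &= \mu(X) = 1 = \mu(E_1) + \mu(E_2).
\end{align*}
% The last line is additivity on the disjoint pair E_1, E_2; in general,
% countable additivity here is the rearrangement of a series of
% non-negative terms.
```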
The construction of measure spaces relevant for most applications requires
further ideas, and to these we now turn.

1.1 Exterior measures and Carathéodory's theorem

To begin the construction of a measure and its corresponding measurable
sets in the general setting requires, as in the special case of Lebesgue
measure considered in Chapter 1, a prerequisite notion of "exterior" measure.
This is defined as follows.

Let $X$ be a set. An exterior measure (or outer measure) $\mu_*$ on
$X$ is a function $\mu_*$ from the collection of all subsets of $X$ to $[0,\infty]$ that
satisfies the following properties:

(i) $\mu_*(\emptyset) = 0$.

(ii) If $E_1 \subset E_2$, then $\mu_*(E_1) \le \mu_*(E_2)$.

(iii) If $E_1, E_2, \ldots$ is a countable family of sets, then

    $\mu_*\left(\bigcup_{j=1}^\infty E_j\right) \le \sum_{j=1}^\infty \mu_*(E_j).$

For instance, the exterior Lebesgue measure $m_*$ in $\mathbb{R}^d$ defined in
Chapter 1 enjoys all these properties. In fact, this example belongs to a
large class of exterior measures that can be obtained using "coverings"
by a family of special sets whose measures are taken as known. This
idea is systematized by the notion of a "premeasure" taken up below in
Section 1.3. A different type of example is the exterior $\alpha$-dimensional
Hausdorff measure $m_\alpha^*$ defined in Chapter 7.

Given an exterior measure $\mu_*$, the problem that one faces is how to
define the corresponding notion of measurable sets. In the case of Lebesgue
measure in $\mathbb{R}^d$ such sets were characterized by their difference from open
(or closed) sets, when considered in terms of $m_*$. For the general case,
Carathéodory found an ingenious substitute condition. It is as follows.

A set $E$ in $X$ is Carathéodory measurable or simply measurable
if one has

(1)    $\mu_*(A) = \mu_*(E \cap A) + \mu_*(E^c \cap A)$  for every $A \subset X$.

In other words, $E$ separates any set $A$ in two parts that behave well
in regard to the exterior measure $\mu_*$. For this reason, (1) is sometimes
referred to as the separation condition. One can show that in $\mathbb{R}^d$ with the
Lebesgue exterior measure the notion of measurability (1) is equivalent
to the definition of Lebesgue measurability given in Chapter 1. (See
Exercise 3.)

A first observation we make is that to prove a set $E$ is measurable, it
suffices to verify

    $\mu_*(A) \ge \mu_*(E \cap A) + \mu_*(E^c \cap A)$  for all $A \subset X$,

since the reverse inequality is automatically verified by the sub-additivity
property (iii) of the exterior measure. We see immediately from the
definition that sets of exterior measure zero are necessarily measurable.
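
A simple example may help to see how restrictive the separation condition can be. The exterior measure below is our own illustrative choice (it does not appear in the text), and the computation is only a sketch.

```latex
% Illustrative assumption: X has at least two points, and
%   \mu_*(\emptyset) = 0,  \mu_*(E) = 1 for every non-empty E \subset X.
% Properties (i)-(iii) of an exterior measure are immediate for this \mu_*.
Let $E$ be a subset of $X$ with $E \neq \emptyset$ and $E \neq X$.
Testing the separation condition with $A = X$ gives
\[
  \mu_*(X) = 1
  \qquad\text{but}\qquad
  \mu_*(E \cap X) + \mu_*(E^c \cap X) = 1 + 1 = 2 ,
\]
so $E$ is not Carath\'eodory measurable. Hence the only measurable
sets for this exterior measure are $\emptyset$ and $X$.
```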

The remarkable fact about the definition (1) is summarized in the next 
theorem. 

Theorem 1.1 Given an exterior measure $\mu_*$ on a set $X$, the collection
$\mathcal{M}$ of Carathéodory measurable sets forms a $\sigma$-algebra. Moreover, $\mu_*$
restricted to $\mathcal{M}$ is a measure.

Proof. Clearly, $\emptyset$ and $X$ belong to $\mathcal{M}$ and the symmetry inherent
in condition (1) shows that $E^c \in \mathcal{M}$ whenever $E \in \mathcal{M}$. Thus $\mathcal{M}$ is
non-empty and closed under complements.

Next, we prove that $\mathcal{M}$ is closed under finite unions of disjoint sets,
and $\mu_*$ is finitely additive on $\mathcal{M}$. Indeed, if $E_1, E_2 \in \mathcal{M}$, and $A$ is any
subset of $X$, then

    $\mu_*(A) = \mu_*(E_2 \cap A) + \mu_*(E_2^c \cap A)$
    $\quad = \mu_*(E_1 \cap E_2 \cap A) + \mu_*(E_1^c \cap E_2 \cap A)$
    $\qquad + \mu_*(E_1 \cap E_2^c \cap A) + \mu_*(E_1^c \cap E_2^c \cap A)$
    $\quad \ge \mu_*((E_1 \cup E_2) \cap A) + \mu_*((E_1 \cup E_2)^c \cap A),$

where in the first two lines we have used the measurability condition
on $E_2$ and then $E_1$, and where the last inequality was obtained using
the sub-additivity of $\mu_*$ and the fact that $E_1 \cup E_2 = (E_1 \cap E_2) \cup (E_1^c \cap
E_2) \cup (E_1 \cap E_2^c)$. Therefore, we have $E_1 \cup E_2 \in \mathcal{M}$, and if $E_1$ and $E_2$ are
disjoint, we find

    $\mu_*(E_1 \cup E_2) = \mu_*(E_1 \cap (E_1 \cup E_2)) + \mu_*(E_1^c \cap (E_1 \cup E_2))$
    $\quad = \mu_*(E_1) + \mu_*(E_2).$

Finally, it suffices to show that $\mathcal{M}$ is closed under countable unions of
disjoint sets, and that $\mu_*$ is countably additive on $\mathcal{M}$. Let $E_1, E_2, \ldots$
denote a countable collection of disjoint sets in $\mathcal{M}$, and define

    $G_n = \bigcup_{j=1}^n E_j$  and  $G = \bigcup_{j=1}^\infty E_j.$

For each $n$, the set $G_n$ is a finite union of sets in $\mathcal{M}$, hence $G_n \in \mathcal{M}$.
Moreover, for any $A \subset X$ we have

    $\mu_*(G_n \cap A) = \mu_*(E_n \cap (G_n \cap A)) + \mu_*(E_n^c \cap (G_n \cap A))$
    $\quad = \mu_*(E_n \cap A) + \mu_*(G_{n-1} \cap A)$
    $\quad = \sum_{j=1}^n \mu_*(E_j \cap A),$

where the last equality is obtained by induction. Since we know that
$G_n \in \mathcal{M}$, and $G^c \subset G_n^c$, we find that

    $\mu_*(A) = \mu_*(G_n \cap A) + \mu_*(G_n^c \cap A) \ge \sum_{j=1}^n \mu_*(E_j \cap A) + \mu_*(G^c \cap A).$

Letting $n$ tend to infinity, we obtain

    $\mu_*(A) \ge \sum_{j=1}^\infty \mu_*(E_j \cap A) + \mu_*(G^c \cap A) \ge \mu_*(G \cap A) + \mu_*(G^c \cap A)$
    $\quad \ge \mu_*(A).$

Therefore all the inequalities above are equalities, and we conclude that
$G \in \mathcal{M}$, as desired. Moreover, by taking $A = G$ in the above, we find
that $\mu_*$ is countably additive on $\mathcal{M}$, and the proof of the theorem is
complete.
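
The final step of the proof, taking $A = G$, can be spelled out as follows. This is a sketch in the notation of the proof, with $E_1, E_2, \ldots$ the disjoint measurable sets and $G$ their union, and it uses only the chain of equalities just established.

```latex
% With A = G, the chain of (in)equalities in the proof becomes a chain of
% equalities, and G^c \cap G = \emptyset, E_j \cap G = E_j.
\begin{align*}
\mu_*(G)
  &= \sum_{j=1}^\infty \mu_*(E_j \cap G) + \mu_*(G^c \cap G) \\
  &= \sum_{j=1}^\infty \mu_*(E_j) + \mu_*(\emptyset)
   = \sum_{j=1}^\infty \mu_*(E_j),
\end{align*}
% which is countable additivity of \mu_* on the disjoint family \{E_j\}.
```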

Our previous observation that sets of exterior measure 0 are Carathéodory
measurable shows that the measure space $(X, \mathcal{M}, \mu)$ in the theorem
is complete: whenever $F \in \mathcal{M}$ satisfies $\mu(F) = 0$ and $E \subset F$, then
$E \in \mathcal{M}$.


1.2 Metric exterior measures 

If the underlying set $X$ is endowed with a "distance function" or "metric,"
there is a particular class of exterior measures that is of interest in
practice. The importance of these exterior measures is that they induce
measures on the natural $\sigma$-algebra generated by the open sets in $X$.

A metric space is a set $X$ equipped with a function $d : X \times X \to
[0,\infty)$ that satisfies:

(i) $d(x,y) = 0$ if and only if $x = y$.

(ii) $d(x,y) = d(y,x)$ for all $x, y \in X$.

(iii) $d(x,z) \le d(x,y) + d(y,z)$, for all $x, y, z \in X$.

The last property is of course called the triangle inequality, and a
function $d$ that satisfies all these conditions is called a metric on $X$. For
example, the set $\mathbb{R}^d$ with $d(x,y) = |x - y|$ is a metric space. Another
example is provided by the space of continuous functions on a compact
set $K$ with $d(f,g) = \sup_{x \in K} |f(x) - g(x)|$.

A metric space $(X, d)$ is naturally equipped with a family of open balls.
Here

    $B_r(x) = \{y \in X : d(x,y) < r\}$

defines the open ball of radius $r$ centered at $x$. Together with this, we say
that a set $\mathcal{O} \subset X$ is open if for any $x \in \mathcal{O}$ there exists $r > 0$ so that the
open ball $B_r(x)$ is contained in $\mathcal{O}$. A set is closed if its complement is
open. With these definitions, one checks easily that an (arbitrary) union
of open sets is open, and a similar intersection of closed sets is closed.
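
As a first check of these definitions, one can verify that an open ball is indeed an open set. The short argument below is a standard sketch; the auxiliary radius $s$ is our own notation.

```latex
% Claim: for every x \in X and r > 0, the ball B_r(x) is open.
Let $y \in B_r(x)$ and set $s = r - d(x,y) > 0$. If $w \in B_s(y)$,
then by the triangle inequality
\[
  d(x,w) \le d(x,y) + d(y,w) < d(x,y) + s = r ,
\]
so $w \in B_r(x)$. Thus $B_s(y) \subset B_r(x)$, and since
$y \in B_r(x)$ was arbitrary, $B_r(x)$ is open.
```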

Finally, on a metric space $X$ we can define, as in Section 3 of Chapter 1,
the Borel $\sigma$-algebra, $\mathcal{B}_X$, that is the smallest $\sigma$-algebra of sets in $X$
that contains the open sets of $X$. In other words $\mathcal{B}_X$ is the intersection
of all $\sigma$-algebras that contain the open sets. Elements in $\mathcal{B}_X$ are called
Borel sets.

We now turn our attention to those exterior measures on $X$ with the
special property of being additive on sets that are "well separated." We
show that this property guarantees that this exterior measure defines a
measure on the Borel $\sigma$-algebra. This is achieved by proving that all
Borel sets are Carathéodory measurable.

Given two sets $A$ and $B$ in a metric space $(X, d)$, the distance between
$A$ and $B$ is defined by

    $d(A,B) = \inf\{d(x,y) : x \in A \text{ and } y \in B\}.$

Then an exterior measure $\mu_*$ on $X$ is a metric exterior measure if it
satisfies

    $\mu_*(A \cup B) = \mu_*(A) + \mu_*(B)$  whenever $d(A,B) > 0$.

This property played a key role in the case of exterior Lebesgue measure. 
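
It is worth noting that the hypothesis $d(A,B) > 0$ is stronger than disjointness. The two pairs of intervals below are our own illustrative choices in $\mathbb{R}$ with the usual distance and the exterior Lebesgue measure $m_*$.

```latex
% Illustrative sets in X = \mathbb{R} with d(x,y) = |x - y|.
\begin{itemize}
  \item $A = (0,1)$ and $B = (1,2)$ are disjoint, yet
        $d(A,B) = \inf\{|x-y| : x \in A,\ y \in B\} = 0$,
        so the defining property of a metric exterior measure
        says nothing directly about this pair.
  \item $A = [0,1]$ and $B = [2,3]$ satisfy $d(A,B) = 1 > 0$, and indeed
        \[
          m_*(A \cup B) = 2 = 1 + 1 = m_*(A) + m_*(B).
        \]
\end{itemize}
```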

Theorem 1.2 If $\mu_*$ is a metric exterior measure on a metric space $X$,
then the Borel sets in $X$ are measurable. Hence $\mu_*$ restricted to $\mathcal{B}_X$ is a
measure.

Proof. By the definition of $\mathcal{B}_X$ it suffices to prove that closed sets
in $X$ are Carathéodory measurable. Therefore, let $F$ denote a closed set
and $A$ a subset of $X$ with $\mu_*(A) < \infty$. For each $n > 0$, let

    $A_n = \{x \in F^c \cap A : d(x,F) \ge 1/n\}.$

Then $A_n \subset A_{n+1}$, and since $F$ is closed we have $F^c \cap A = \bigcup_{n=1}^\infty A_n$.
Also, the distance between $F \cap A$ and $A_n$ is $\ge 1/n$, and since $\mu_*$ is a
metric exterior measure, we have

(2)    $\mu_*(A) \ge \mu_*((F \cap A) \cup A_n) = \mu_*(F \cap A) + \mu_*(A_n).$

Next, we claim that

(3)    $\lim_{n\to\infty} \mu_*(A_n) = \mu_*(F^c \cap A).$

To see this, let $B_n = A_{n+1} \cap A_n^c$ and note that

    $d(B_{n+1}, A_n) \ge \frac{1}{n(n+1)}.$

Indeed, if $x \in B_{n+1}$ and $d(x,y) < 1/n(n+1)$ the triangle inequality shows
that $d(y,F) < 1/n$, hence $y \notin A_n$. Therefore

    $\mu_*(A_{2k+1}) \ge \mu_*(B_{2k} \cup A_{2k-1}) = \mu_*(B_{2k}) + \mu_*(A_{2k-1}),$

and this implies that

    $\mu_*(A_{2k+1}) \ge \sum_{j=1}^k \mu_*(B_{2j}).$

A similar argument also gives

    $\mu_*(A_{2k}) \ge \sum_{j=1}^k \mu_*(B_{2j-1}).$

Since $\mu_*(A)$ is finite, we find that both series $\sum \mu_*(B_{2j})$ and $\sum \mu_*(B_{2j-1})$
are convergent. Finally, we note that

    $\mu_*(A_n) \le \mu_*(F^c \cap A) \le \mu_*(A_n) + \sum_{j=n}^\infty \mu_*(B_j),$

and this proves the limit (3). Letting $n$ tend to infinity in the
inequality (2) we find that $\mu_*(A) \ge \mu_*(F \cap A) + \mu_*(F^c \cap A)$, and hence $F$ is
measurable, as was to be shown.

Given a metric space $X$, a measure $\mu$ defined on the Borel sets of $X$
will be referred to as a Borel measure. Borel measures that assign a
finite measure to all balls (of finite radius) also satisfy a useful regularity
property. The requirement that $\mu(B) < \infty$ for all balls $B$ is satisfied in
many (but not in all) circumstances that arise in practice.^1 When it does
hold, we get the following proposition.

Proposition 1.3 Suppose the Borel measure $\mu$ is finite on all balls in
$X$ of finite radius. Then for any Borel set $E$ and any $\epsilon > 0$, there are
an open set $O$ and a closed set $F$ such that $E \subset O$ and $\mu(O - E) < \epsilon$,
while $F \subset E$ and $\mu(E - F) < \epsilon$.

Proof. We need the following preliminary observation. Suppose
$F^* = \bigcup_{k=1}^\infty F_k$, where the $F_k$ are closed sets. Then for any $\epsilon > 0$, we can
find a closed set $F \subset F^*$ such that $\mu(F^* - F) < \epsilon$. To prove this we can
assume that the sets $\{F_k\}$ are increasing. Fix a point $x_0 \in X$, and let $B_n$
denote the ball $\{x : d(x, x_0) < n\}$, with $B_0 = \emptyset$. Since $\bigcup_{n=1}^\infty B_n = X$,
we have that

\[ F^* = \bigcup_{n=1}^\infty F^* \cap (\overline{B}_n - B_{n-1}). \]

Now for each $n$, $F^* \cap (\overline{B}_n - B_{n-1})$ is the limit as $k \to \infty$ of the increasing
sequence of closed sets $F_k \cap (\overline{B}_n - B_{n-1})$, so (recalling that $\overline{B}_n$ has finite
measure) we can find an $N = N(n)$ so that $(F^* - F_{N(n)}) \cap (\overline{B}_n - B_{n-1})$
has measure less than $\epsilon/2^n$. If we now let

\[ F = \bigcup_{n=1}^\infty \left( F_{N(n)} \cap (\overline{B}_n - B_{n-1}) \right), \]

it follows that the measure of $F^* - F$ is less than $\sum_{n=1}^\infty \epsilon/2^n = \epsilon$. We
also see that $F \cap \overline{B}_k$ is closed since it is the finite union of closed sets.
Thus $F$ itself is closed because, as is easily seen, any set $F$ is closed
whenever the sets $F \cap \overline{B}_k$ are closed for all $k$.

Having established the observation, we call $\mathcal{C}$ the collection of all sets
that satisfy the conclusions of the proposition. Notice first that if $E$
belongs to $\mathcal{C}$ then automatically so does its complement.


^1 This restriction is not always valid for the Hausdorff measures that are considered in
the next chapter.

Suppose now that $E = \bigcup_{k=1}^\infty E_k$, with each $E_k \in \mathcal{C}$. Then there are
open sets $O_k$, $O_k \supset E_k$, with $\mu(O_k - E_k) < \epsilon/2^k$. However, if $O =
\bigcup_{k=1}^\infty O_k$, then $O - E \subset \bigcup_{k=1}^\infty (O_k - E_k)$, and so
$\mu(O - E) < \sum_{k=1}^\infty \epsilon/2^k = \epsilon$.

Next, there are closed sets $F_k \subset E_k$ with $\mu(E_k - F_k) < \epsilon/2^k$. Thus if
$F^* = \bigcup_{k=1}^\infty F_k$, we see as before that $\mu(E - F^*) < \epsilon$. However, $F^*$ is not
necessarily closed, so we can use our preliminary observation to find a
closed set $F \subset F^*$ with $\mu(F^* - F) < \epsilon$. Thus $\mu(E - F) < 2\epsilon$. Since $\epsilon$ is
arbitrary, this proves that $\bigcup_{k=1}^\infty E_k$ belongs to $\mathcal{C}$.

Let us finally note that any open set $O$ is in $\mathcal{C}$. The property regarding
containment by open sets is immediate. To find a closed $F \subset O$, so
that $\mu(O - F) < \epsilon$, let $F_k = \{x \in \overline{B}_k : d(x, O^c) \ge 1/k\}$. Then it is clear
that each $F_k$ is closed and $O = \bigcup_{k=1}^\infty F_k$. We then need only apply the
observation again to find the required set $F$. Thus we have shown that $\mathcal{C}$
is a $\sigma$-algebra that contains the open sets, and hence all Borel sets. The
proposition is therefore proved.


1.3 The extension theorem 

As we have seen, a class of measurable sets on X can be constructed 
once we start with a given exterior measure. However, the definition of 
an exterior measure usually depends on a more primitive idea of measure 
defined on a simpler class of sets. This is the role of a premeasure defined 
below. As we will show, any premeasure can be extended to a measure 
on X. We begin with several definitions. 

Let $X$ be a set. An algebra in $X$ is a non-empty collection of subsets
of $X$ that is closed under complements, finite unions, and finite intersections.
Let $\mathcal{A}$ be an algebra in $X$. A premeasure on an algebra $\mathcal{A}$ is a
function $\mu_0 : \mathcal{A} \to [0, \infty]$ that satisfies:

(i) $\mu_0(\emptyset) = 0$.

(ii) If $E_1, E_2, \ldots$ is a countable collection of disjoint sets in $\mathcal{A}$ with
$\bigcup_{k=1}^\infty E_k \in \mathcal{A}$, then

\[ \mu_0\left( \bigcup_{k=1}^\infty E_k \right) = \sum_{k=1}^\infty \mu_0(E_k). \]

In particular, $\mu_0$ is finitely additive on $\mathcal{A}$.


Premeasures give rise to exterior measures in a natural way.

Lemma 1.4 If $\mu_0$ is a premeasure on an algebra $\mathcal{A}$, define $\mu_*$ on any
subset $E$ of $X$ by

\[ \mu_*(E) = \inf \left\{ \sum_{j=1}^\infty \mu_0(E_j) : E \subset \bigcup_{j=1}^\infty E_j, \text{ where } E_j \in \mathcal{A} \text{ for all } j \right\}. \]

Then, $\mu_*$ is an exterior measure on $X$ that satisfies:

(i) $\mu_*(E) = \mu_0(E)$ for all $E \in \mathcal{A}$.

(ii) All sets in $\mathcal{A}$ are measurable in the sense of (1).


Proof. Proving that $\mu_*$ is an exterior measure presents no difficulty.
To see why the restriction of $\mu_*$ to $\mathcal{A}$ coincides with $\mu_0$, suppose that
$E \in \mathcal{A}$. Clearly, one always has $\mu_*(E) \le \mu_0(E)$ since $E$ covers itself. To
prove the reverse inequality let $E \subset \bigcup_{j=1}^\infty E_j$, where $E_j \in \mathcal{A}$ for all $j$.
Then, if we set

\[ E'_k = E \cap \left( E_k - \bigcup_{j=1}^{k-1} E_j \right), \]

the sets $E'_k$ are disjoint elements of $\mathcal{A}$, $E'_k \subset E_k$ and $E = \bigcup_{k=1}^\infty E'_k$. By
(ii) in the definition of a premeasure, we have

\[ \mu_0(E) = \sum_{k=1}^\infty \mu_0(E'_k) \le \sum_{k=1}^\infty \mu_0(E_k). \]

Therefore, we find that $\mu_0(E) \le \mu_*(E)$, as desired.

Finally, we must prove that sets in $\mathcal{A}$ are measurable for $\mu_*$. Let $A$
be any subset of $X$, $E \in \mathcal{A}$, and $\epsilon > 0$. By definition, there exists a
countable collection $E_1, E_2, \ldots$ of sets in $\mathcal{A}$ such that $A \subset \bigcup_{j=1}^\infty E_j$ and

\[ \sum_{j=1}^\infty \mu_0(E_j) \le \mu_*(A) + \epsilon. \]

Since $\mu_0$ is a premeasure, it is finitely additive on $\mathcal{A}$ and therefore

\[ \sum_{j=1}^\infty \mu_0(E_j) = \sum_{j=1}^\infty \mu_0(E \cap E_j) + \sum_{j=1}^\infty \mu_0(E^c \cap E_j)
   \ge \mu_*(E \cap A) + \mu_*(E^c \cap A). \]

Since $\epsilon$ is arbitrary, we conclude that $\mu_*(A) \ge \mu_*(E \cap A) + \mu_*(E^c \cap A)$,
as desired.

The $\sigma$-algebra generated by an algebra $\mathcal{A}$ is by definition the smallest
$\sigma$-algebra that contains $\mathcal{A}$. The above lemma then provides the necessary
step for extending $\mu_0$ on $\mathcal{A}$ to a measure on the $\sigma$-algebra generated by
$\mathcal{A}$.
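The construction in Lemma 1.4 can be tried out on a toy example. The sketch below is an illustration under stated assumptions, not part of the text: the set X = {0, ..., 5}, the three "atoms", and their weights are all hypothetical. The algebra consists of all unions of atoms, and because it is finite, countable covers may be replaced by finite subfamilies (repeating a set only increases the sum), so the infimum can be found by brute force.

```python
from itertools import chain, combinations

# Hypothetical finite model: X = {0,...,5} split into three atoms with weights.
atoms = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]
weight = {atoms[0]: 1.0, atoms[1]: 2.0, atoms[2]: 0.5}

def subfamilies(items):
    """All finite subfamilies of a list, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

# The algebra generated by the atoms: every union of atoms (8 sets in all).
algebra = [frozenset().union(*family) for family in subfamilies(atoms)]

def mu0(E):
    """Premeasure of a set E in the algebra: total weight of the atoms inside E."""
    return sum(w for atom, w in weight.items() if atom <= E)

def mu_star(E):
    """Exterior measure of an arbitrary subset E of X: least total premeasure
    of a cover of E by sets of the algebra (finite subfamilies suffice here)."""
    E = frozenset(E)
    return min(
        sum(mu0(S) for S in cover)
        for cover in subfamilies(algebra)
        if E <= frozenset().union(*cover)
    )

print(mu_star({0}))     # a set outside the algebra still gets an exterior measure
print(mu_star({1, 2}))  # a set meeting two atoms
print(all(mu_star(S) == mu0(S) for S in algebra))  # property (i) of the lemma
```

In this model the exterior measure of a set is the total weight of the atoms it meets, which makes both conclusions of the lemma easy to check by hand.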


Theorem 1.5 Suppose that $\mathcal{A}$ is an algebra of sets in $X$, $\mu_0$ a premeasure
on $\mathcal{A}$, and $\mathcal{M}$ the $\sigma$-algebra generated by $\mathcal{A}$. Then there exists a
measure $\mu$ on $\mathcal{M}$ that extends $\mu_0$.

One notes below that $\mu$ is the only such extension of $\mu_0$ under the
assumption that $\mu$ is $\sigma$-finite.

Proof. The exterior measure $\mu_*$ induced by $\mu_0$ defines a measure $\mu$ on
the $\sigma$-algebra of Carathéodory measurable sets. Therefore, by the result
in the previous lemma, $\mu$ is also a measure on $\mathcal{M}$ that extends $\mu_0$. (We
should observe that in general the class $\mathcal{M}$ is not as large as the class of
all sets that are measurable in the sense of (1).)

To prove that this extension is unique whenever $\mu$ is $\sigma$-finite, we argue
as follows. Suppose that $\nu$ is another measure on $\mathcal{M}$ that coincides with
$\mu_0$ on $\mathcal{A}$, and suppose that $F \in \mathcal{M}$ has finite measure. We claim that
$\mu(F) = \nu(F)$. If $F \subset \bigcup E_j$, where $E_j \in \mathcal{A}$, then

\[ \nu(F) \le \sum_{j=1}^\infty \nu(E_j) = \sum_{j=1}^\infty \mu_0(E_j), \]

so that $\nu(F) \le \mu(F)$. To prove the reverse inequality, note that if $E =
\bigcup E_j$, then the fact that $\nu$ and $\mu$ are two measures that agree on $\mathcal{A}$ gives

\[ \nu(E) = \lim_{n \to \infty} \nu\Big( \bigcup_{j=1}^n E_j \Big) = \lim_{n \to \infty} \mu\Big( \bigcup_{j=1}^n E_j \Big) = \mu(E). \]

If the sets $E_j$ are chosen so that $\mu(E) \le \mu(F) + \epsilon$, then the fact that
$\mu(F) < \infty$ implies $\mu(E - F) \le \epsilon$, and therefore

\[ \mu(F) \le \mu(E) = \nu(E) = \nu(F) + \nu(E - F) \le \nu(F) + \mu(E - F) \le \nu(F) + \epsilon. \]

Since $\epsilon$ is arbitrary, we find that $\mu(F) \le \nu(F)$, as desired.

Finally, we use this last result to prove that if $\mu$ is $\sigma$-finite, then $\mu =
\nu$. Indeed, we may write $X = \bigcup E_j$, where $E_1, E_2, \ldots$ is a countable
collection of disjoint sets in $\mathcal{A}$ with $\mu(E_j) < \infty$. Then for any $F \in \mathcal{M}$ we
have

\[ \mu(F) = \sum \mu(F \cap E_j) = \sum \nu(F \cap E_j) = \nu(F), \]

and the uniqueness is proved.

For later use we record the following observation about the premeasure
$\mu_0$ on the algebra $\mathcal{A}$ and the resulting measure $\mu_*$ that is implicit in the
argument given above. The details of the proof may be left to the reader.

We define $\mathcal{A}_\sigma$ as the collection of sets that are countable unions of sets
in $\mathcal{A}$, and $\mathcal{A}_{\sigma\delta}$ as the sets that arise as countable intersections of sets in
$\mathcal{A}_\sigma$.

Proposition 1.6 For any set $E$ and any $\epsilon > 0$, there are sets $E_1 \in
\mathcal{A}_\sigma$ and $E_2 \in \mathcal{A}_{\sigma\delta}$, such that $E \subset E_1$, $E \subset E_2$, and $\mu_*(E_1) \le \mu_*(E) + \epsilon$,
while $\mu_*(E_2) = \mu_*(E)$.

2 Integration on a measure space 

Once we have established the basic properties of a measure space $X$, the
fundamental facts about measurable functions and integration of such
functions on $X$ can be deduced as in the case of the Lebesgue measure
on $\mathbb{R}^d$. Indeed, the results in Section 4 of Chapter 1 and all of Chapter 2
go over to the general case, with proofs remaining almost word-for-word
the same. For this reason we shall not repeat these arguments but limit
ourselves to the bare statement of the main points. The reader should
have no difficulty in filling in the missing details.

To avoid unnecessary complications we will assume throughout that
the measure space $(X, \mathcal{M}, \mu)$ under consideration is $\sigma$-finite.


Measurable functions 

A function $f$ on $X$ with values in the extended real numbers is measurable
if

\[ f^{-1}([-\infty, a)) = \{x \in X : f(x) < a\} \in \mathcal{M} \quad \text{for all } a \in \mathbb{R}. \]

With this definition, the basic properties of measurable functions obtained
in the case of $\mathbb{R}^d$ with the Lebesgue measure continue to hold.
(See Properties 3 through 6 for measurable functions in Chapter 1.) For
instance, the collection of measurable functions is closed under the basic
algebraic manipulations. Also, the pointwise limits of measurable
functions are measurable.




Chapter 6. ABSTRACT MEASURE AND INTEGRATION THEORY 


The notion of “almost everywhere” that we use now is with respect to
the measure $\mu$. For instance, if $f$ and $g$ are measurable functions on $X$,
we write $f = g$ a.e. to say that

\[ \mu(\{x \in X : f(x) \ne g(x)\}) = 0. \]

A simple function on $X$ takes the form

\[ \sum_{k=1}^N a_k \chi_{E_k}, \]

where $E_k$ are measurable sets of finite measure and $a_k$ are real numbers.
Approximations by simple functions played an important role in the
definition of the Lebesgue integral. Fortunately, this result continues to hold
in our abstract setting.

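• Suppose $f$ is a non-negative measurable function on a measure
space $(X, \mathcal{M}, \mu)$. Then there exists a sequence of simple functions
$\{\varphi_k\}_{k=1}^\infty$ that satisfies

\[ \varphi_k(x) \le \varphi_{k+1}(x) \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x) \quad \text{for all } x. \]

In general, if $f$ is only measurable, there exists a sequence of simple
functions $\{\varphi_k\}_{k=1}^\infty$ that satisfies

\[ |\varphi_k(x)| \le |\varphi_{k+1}(x)| \quad \text{and} \quad \lim_{k \to \infty} \varphi_k(x) = f(x) \quad \text{for all } x. \]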


The proof of this result can be obtained with some obvious minor
modifications of the proofs of Theorems 4.1 and 4.2 in Chapter 1. Here,
one makes use of the technical condition imposed on $X$, that of being
$\sigma$-finite. Indeed, if we write $X = \bigcup F_k$, where $F_k \in \mathcal{M}$ are of finite measure,
then the sets $F_k$ play the role of the cubes $Q_k$ in the proof of Theorem 4.1,
Chapter 1.
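One common way to realize such an approximating sequence for a non-negative function is to round its values down to a dyadic grid and truncate at a growing height. The Python sketch below shows only this value-level step and is an assumption-laden illustration: the name `phi` and the sample function are hypothetical, and the additional truncation to the sets of finite measure (the role of the $F_k$ above) is omitted, as if the whole space had finite measure.

```python
import math

def phi(k, value):
    """k-th dyadic approximation of a non-negative value f(x): round down to a
    multiple of 2**-k, then truncate at height k."""
    return min(math.floor(value * 2**k) / 2**k, float(k))

def f(x):
    """Sample non-negative function, chosen only for illustration."""
    return x * x

for x in (0.0, 0.3, 1.7, 2.5):
    # Each row is non-decreasing in k and approaches f(x).
    print(x, f(x), [phi(k, f(x)) for k in (1, 2, 4, 8, 16)])
```

Each `phi(k, f(x))` takes only finitely many values as x varies, so the corresponding function is simple whenever the level sets of f are measurable sets of finite measure.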

Another important result that generalizes immediately is Egorov’s theorem.

• Suppose $\{f_k\}_{k=1}^\infty$ is a sequence of measurable functions defined on
a measurable set $E \subset X$ with $\mu(E) < \infty$, and $f_k \to f$ a.e. Then
for each $\epsilon > 0$ there is a set $A_\epsilon$ with $A_\epsilon \subset E$, $\mu(E - A_\epsilon) \le \epsilon$, and
such that $f_k \to f$ uniformly on $A_\epsilon$.
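A standard illustration of Egorov's theorem, added here as a worked example rather than taken from the text, uses $X = E = [0,1]$ with Lebesgue measure and the powers $f_k(x) = x^k$:

```latex
\[
f_k(x) = x^k \to 0 \quad \text{for every } x \in [0,1), \qquad f_k(1) = 1 \ \text{ for all } k,
\]
so $f_k \to 0$ a.e. on $E = [0,1]$, but not uniformly on $E$, since
$\sup_{0 \le x < 1} x^k = 1$ for every $k$.
Given $\epsilon \in (0,1)$, let $A_\epsilon = [0, 1-\epsilon]$. Then
\[
\mu(E - A_\epsilon) = \epsilon
\quad \text{and} \quad
\sup_{x \in A_\epsilon} |f_k(x) - 0| = (1-\epsilon)^k \to 0 \ \text{ as } k \to \infty .
\]
```

Removing an arbitrarily small neighborhood of the single bad point restores uniform convergence, which is exactly what the theorem asserts in general.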
Definition and main properties of the integral

The four-step approach to the construction of the Lebesgue integral that
begins with its definition on simple functions given in Chapter 2 carries
over to the situation of a $\sigma$-finite measure space $(X, \mathcal{M}, \mu)$. This leads
to the notion of the integral, with respect to the measure $\mu$, of a
non-negative measurable function $f$ on $X$. This integral is denoted by

\[ \int_X f(x) \, d\mu(x), \]

which we sometimes simplify as $\int_X f \, d\mu$, $\int f \, d\mu$ or $\int f$, when no
confusion is possible. Finally, we say that a measurable function $f$ is
integrable if

\[ \int_X |f(x)| \, d\mu(x) < \infty. \]

The elementary properties of the integral, such as linearity and
monotonicity, continue to hold in this general setting, as well as the following
basic limit theorems.

(i) Fatou’s lemma. If $\{f_n\}$ is a sequence of non-negative measurable
functions on $X$, then

\[ \int \liminf_{n \to \infty} f_n \, d\mu \le \liminf_{n \to \infty} \int f_n \, d\mu. \]

(ii) Monotone convergence. If $\{f_n\}$ is a sequence of non-negative
measurable functions with $f_n \nearrow f$, then

\[ \lim_{n \to \infty} \int f_n = \int f. \]

(iii) Dominated convergence. If $\{f_n\}$ is a sequence of measurable
functions with $f_n \to f$ a.e., and such that $|f_n| \le g$ for some integrable
$g$, then

\[ \int |f_n - f| \, d\mu \to 0 \quad \text{as } n \to \infty, \]

and consequently

\[ \int f_n \, d\mu \to \int f \, d\mu \quad \text{as } n \to \infty. \]
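The inequality in Fatou's lemma can be strict. A standard example, added here as an illustration and not taken from the text, uses $X = [0,1]$ with Lebesgue measure $m$ and tall thin spikes near the origin:

```latex
\[
f_n = n \, \chi_{(0,1/n)}, \qquad \int f_n \, dm = n \cdot \frac{1}{n} = 1 \quad \text{for all } n \ge 1 .
\]
For each fixed $x \in (0,1]$ we have $f_n(x) = 0$ as soon as $1/n \le x$, and $f_n(0) = 0$
for all $n$, so $\liminf_{n \to \infty} f_n(x) = \lim_{n \to \infty} f_n(x) = 0$ for every $x$. Hence
\[
\int \liminf_{n \to \infty} f_n \, dm = 0 \;<\; 1 = \liminf_{n \to \infty} \int f_n \, dm .
\]
```

The same sequence shows that the domination hypothesis in (iii) cannot be dropped: $f_n \to 0$ everywhere while $\int |f_n - 0| \, dm = 1$ for all $n$, so no integrable $g$ can satisfy $|f_n| \le g$ for every $n$.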
The spaces $L^1(X, \mu)$ and $L^2(X, \mu)$

The equivalence classes (modulo functions that vanish almost everywhere)
of integrable functions on $(X, \mathcal{M}, \mu)$ form a vector space equipped
with a norm. This space is denoted by $L^1(X, \mu)$ and its norm is

\[ \|f\|_{L^1(X,\mu)} = \int_X |f(x)| \, d\mu(x). \tag{4} \]

Similarly we can define $L^2(X, \mu)$ to be the equivalence class of measurable
functions for which $\int_X |f(x)|^2 \, d\mu(x) < \infty$. The norm is then

\[ \|f\|_{L^2(X,\mu)} = \left( \int_X |f(x)|^2 \, d\mu(x) \right)^{1/2}. \tag{5} \]

There is also an inner product on this space given by

\[ (f, g) = \int_X f(x) \overline{g(x)} \, d\mu(x). \]

The proofs of Proposition 2.1 and Theorem 2.2 in Chapter 2, as well as
the results in Section 1 of Chapter 4, extend to this general case and
give:

• The space $L^1(X, \mu)$ is a complete normed vector space.

• The space $L^2(X, \mu)$ is a (possibly non-separable) Hilbert space.

3 Examples 

We now discuss some useful examples of the general theory. 

3.1 Product measures and a general Fubini theorem 

Our first example concerns the construction of product measures, and 
leads to a general form of the theorem that expresses a multiple integral 
as a repeated integral, extending the case of Euclidean space considered 
in Section 3 of Chapter 2. 

Suppose $(X_1, \mathcal{M}_1, \mu_1)$ and $(X_2, \mathcal{M}_2, \mu_2)$ are a pair of measure spaces.
We want to describe the product measure $\mu_1 \times \mu_2$ on the space $X =
X_1 \times X_2 = \{(x_1, x_2) : x_1 \in X_1, \ x_2 \in X_2\}$.

We will assume here that the two measure spaces are each complete
and $\sigma$-finite.

We begin by considering measurable rectangles: these are subsets
of $X$ of the form $A \times B$ with $A$ and $B$ measurable sets, that is, $A \in \mathcal{M}_1$
and $B \in \mathcal{M}_2$. We then let $\mathcal{A}$ denote the collection of all sets in $X$ that are
finite unions of disjoint measurable rectangles. It is easy to check that $\mathcal{A}$
is an algebra of subsets of $X$. (Indeed, the complement of a measurable
rectangle is the union of three disjoint such rectangles, while the union
of two measurable rectangles is the disjoint union of at most six such
rectangles.) From now on we abbreviate our terminology by referring to
measurable rectangles simply as “rectangles.”

On the rectangles we define the function $\mu_0$ by $\mu_0(A \times B) = \mu_1(A)\mu_2(B)$.

Now the fact that $\mu_0$ has a unique extension to the algebra $\mathcal{A}$ for which
$\mu_0$ becomes a premeasure is a consequence of the following fact: whenever
a rectangle $A \times B$ is the disjoint union of a countable collection of
rectangles $\{A_j \times B_j\}$, $A \times B = \bigcup_{j=1}^\infty A_j \times B_j$, then

\[ \mu_0(A \times B) = \sum_{j=1}^\infty \mu_0(A_j \times B_j). \tag{6} \]

To prove this, observe that if $x_1 \in A$, then for each $x_2 \in B$ the point
$(x_1, x_2)$ belongs to exactly one $A_j \times B_j$. Therefore we see that $B$ is the
disjoint union of the $B_j$ for which $x_1 \in A_j$. By the countable additivity
property of the measure $\mu_2$ this has as an immediate consequence the
fact that

\[ \chi_A(x_1) \mu_2(B) = \sum_{j=1}^\infty \chi_{A_j}(x_1) \mu_2(B_j). \]

Hence integrating in $x_1$ and using the monotone convergence theorem we
get $\mu_1(A)\mu_2(B) = \sum_{j=1}^\infty \mu_1(A_j)\mu_2(B_j)$, which is (6).

Now that we know that $\mu_0$ is a premeasure on $\mathcal{A}$, we obtain from
Theorem 1.5 a measure (which we denote by $\mu = \mu_1 \times \mu_2$) on the $\sigma$-algebra
$\mathcal{M}$ of sets generated by the algebra $\mathcal{A}$ of measurable rectangles. In this
way, we have defined the product measure space $(X_1 \times X_2, \mathcal{M}, \mu_1 \times \mu_2)$.

Given a set $E$ in $\mathcal{M}$ we shall now consider slices

\[ E_{x_1} = \{x_2 \in X_2 : (x_1, x_2) \in E\} \quad \text{and} \quad E^{x_2} = \{x_1 \in X_1 : (x_1, x_2) \in E\}. \]

We recall the definitions according to which $\mathcal{A}_\sigma$ denotes the collection
of sets that are countable unions of elements of $\mathcal{A}$, and $\mathcal{A}_{\sigma\delta}$ the sets
that arise as countable intersections of sets from $\mathcal{A}_\sigma$. We then have the
following key fact.

Proposition 3.1 If $E$ belongs to $\mathcal{A}_{\sigma\delta}$, then $E^{x_2}$ is $\mu_1$-measurable for
every $x_2$; moreover, $\mu_1(E^{x_2})$ is a $\mu_2$-measurable function. In addition

\[ \int_{X_2} \mu_1(E^{x_2}) \, d\mu_2 = (\mu_1 \times \mu_2)(E). \tag{7} \]

Proof. One notes first that all the assertions hold immediately when
$E$ is a (measurable) rectangle. Next suppose $E$ is a set in $\mathcal{A}_\sigma$. Then we
can decompose it as a countable union of disjoint rectangles $E_j$. (If the
$E_j$ are not already disjoint we only need to replace the $E_j$ by
$\bigcup_{k \le j} E_k - \bigcup_{k \le j-1} E_k$.) Then for each $x_2$ we have $E^{x_2} = \bigcup_{j=1}^\infty E_j^{x_2}$, and we observe
that $\{E_j^{x_2}\}$ are disjoint sets. Thus by (7) applied to each rectangle $E_j$
and the monotone convergence theorem we get our conclusion for each
set $E \in \mathcal{A}_\sigma$.

Next assume $E \in \mathcal{A}_{\sigma\delta}$ and that $(\mu_1 \times \mu_2)(E) < \infty$. Then there is
a sequence $\{E_j\}$ of sets with $E_j \in \mathcal{A}_\sigma$, $E_{j+1} \subset E_j$, and $E = \bigcap_{j=1}^\infty E_j$.
We let $f_j(x_2) = \mu_1(E_j^{x_2})$ and $f(x_2) = \mu_1(E^{x_2})$. To see that $E^{x_2}$ is
measurable and $f(x_2)$ is well-defined, note that $E^{x_2}$ is the decreasing
limit of the sets $E_j^{x_2}$, which we have seen by the above are measurable.
Moreover, since $E_1 \in \mathcal{A}_\sigma$ and $(\mu_1 \times \mu_2)(E_1) < \infty$, we see that
$f_j(x_2) \to f(x_2)$, as $j \to \infty$ for each $x_2$. Thus $f(x_2)$ is measurable.
However, $\{f_j(x_2)\}$ is a decreasing sequence of non-negative functions, hence

\[ \int_{X_2} f(x_2) \, d\mu_2(x_2) = \lim_{j \to \infty} \int_{X_2} f_j(x_2) \, d\mu_2(x_2), \]

and therefore (7) is proved in the case when $(\mu_1 \times \mu_2)(E) < \infty$. Now
since we assumed both $\mu_1$ and $\mu_2$ are $\sigma$-finite, we can find sequences $F_1 \subset
F_2 \subset \cdots \subset F_j \subset \cdots \subset X_1$ and $G_1 \subset G_2 \subset \cdots \subset G_j \subset \cdots \subset X_2$, with
$\bigcup_{j=1}^\infty F_j = X_1$, $\bigcup_{j=1}^\infty G_j = X_2$, $\mu_1(F_j) < \infty$, and $\mu_2(G_j) < \infty$ for all $j$.
Then we merely need to replace $E$ by $E_j = E \cap (F_j \times G_j)$, and let $j \to \infty$
to obtain the general result.

We now extend the result in the above proposition to an arbitrary
measurable set $E$ in $X_1 \times X_2$, that is, $E \in \mathcal{M}$, the $\sigma$-algebra generated
by the measurable rectangles.

Proposition 3.2 If $E$ is an arbitrary measurable set in $X$, then the
conclusions of Proposition 3.1 are still valid except that we only assert that
$E^{x_2}$ is $\mu_1$-measurable and $\mu_1(E^{x_2})$ is defined for almost every $x_2 \in X_2$.


Proof. Consider first the case when $E$ is a set of measure zero.
Then we know by Proposition 1.6 that there is a set $F \in \mathcal{A}_{\sigma\delta}$ such that
$E \subset F$ and $(\mu_1 \times \mu_2)(F) = 0$. Since $E^{x_2} \subset F^{x_2}$ for every $x_2$ and $F^{x_2}$ has
$\mu_1$-measure zero for almost every $x_2$ by (7) applied to $F$, the assumed
completeness of the measure $\mu_1$ shows that $E^{x_2}$ is measurable and has
measure zero for those $x_2$. Thus the desired conclusion holds when $E$
has measure zero.

If we drop this assumption on $E$, we can invoke Proposition 1.6 again
to find an $F \in \mathcal{A}_{\sigma\delta}$, $F \supset E$, such that $F - E = Z$ has measure zero.
Since $F^{x_2} - E^{x_2} = Z^{x_2}$ we can apply the case we have just proved, and
find that for almost all $x_2$ the set $E^{x_2}$ is measurable and $\mu_1(E^{x_2}) =
\mu_1(F^{x_2}) - \mu_1(Z^{x_2})$. From this the proposition follows.

We now obtain the main result, generalizing Fubini’s theorem in
Chapter 2.

Theorem 3.3 In the setting above, suppose $f(x_1, x_2)$ is an integrable
function on $(X_1 \times X_2, \mu_1 \times \mu_2)$.

(i) For almost every $x_2 \in X_2$, the slice $f^{x_2}(x_1) = f(x_1, x_2)$ is
integrable on $(X_1, \mu_1)$.

(ii) $\int_{X_1} f(x_1, x_2) \, d\mu_1$ is an integrable function on $X_2$.

(iii) $\int_{X_2} \left( \int_{X_1} f(x_1, x_2) \, d\mu_1 \right) d\mu_2 = \int_{X_1 \times X_2} f \, d(\mu_1 \times \mu_2)$.

Proof. Note that if the desired conclusions hold for finitely many
functions, they also hold for their linear combinations. In particular it
suffices to assume that $f$ is non-negative. When $f = \chi_E$, where $E$ is a set
of finite measure, what we wish to prove is contained in Proposition 3.2.
Hence the desired result also holds for simple functions. Therefore by
the monotone convergence theorem it is established for all non-negative
functions, and the theorem is proved.

We remark that in general the product space $(X, \mathcal{M}, \mu)$ constructed
above is not complete. However, if we define the completed space $(X, \overline{\mathcal{M}}, \overline{\mu})$
as in Exercise 2, the theorem continues to hold in this completed space.
The proof requires only a simple modification of the argument in
Proposition 3.2.
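In the simplest possible setting, two finite sets with weighted counting measures, the identity in part (iii) of Theorem 3.3 reduces to interchanging finite sums. The Python sketch below checks this with exact rational arithmetic; the point sets, the weights, and the function `f` are all hypothetical choices made for illustration.

```python
from fractions import Fraction as Fr

# Hypothetical finite measure spaces: mu1 on X1 = {"a", "b"}, mu2 on X2 = {0, 1, 2}.
mu1 = {"a": Fr(1, 2), "b": Fr(3, 2)}
mu2 = {0: Fr(1), 1: Fr(2), 2: Fr(1, 3)}

def f(x1, x2):
    """A function of both signs on the product X1 x X2."""
    return Fr(x2 + 1) if x1 == "a" else Fr(-2 * x2)

# Integral against the product measure: each point (x1, x2) has mass mu1 * mu2.
double = sum(f(x1, x2) * mu1[x1] * mu2[x2] for x1 in mu1 for x2 in mu2)

# Iterated integrals in both orders.
inner_x1_first = sum(sum(f(x1, x2) * mu1[x1] for x1 in mu1) * mu2[x2] for x2 in mu2)
inner_x2_first = sum(sum(f(x1, x2) * mu2[x2] for x2 in mu2) * mu1[x1] for x1 in mu1)

print(double, inner_x1_first, inner_x2_first)  # all three agree
```

Every subset of a finite set is measurable here, so none of the measurability issues handled in Propositions 3.1 and 3.2 arise; the example only illustrates the bookkeeping behind the final identity.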


3.2 Integration formula for polar coordinates 

The polar coordinates of a point $x \in \mathbb{R}^d - \{0\}$ are the pair $(r, \gamma)$, where
$0 < r < \infty$ and $\gamma$ belongs to the unit sphere $S^{d-1} = \{x \in \mathbb{R}^d, \ |x| = 1\}$.
These are determined by

\[ r = |x|, \quad \gamma = \frac{x}{|x|}, \quad \text{and reciprocally by } x = r\gamma. \tag{8} \]

Our intention here is to deal with the formula that, with appropriate
definitions and under suitable hypotheses, states:

\[ \int_{\mathbb{R}^d} f(x) \, dx = \int_{S^{d-1}} \left( \int_0^\infty f(r\gamma) \, r^{d-1} \, dr \right) d\sigma(\gamma). \tag{9} \]
For this we consider the following pair of measure spaces. First,
$(X_1, \mathcal{M}_1, \mu_1)$, where $X_1 = (0, \infty)$, $\mathcal{M}_1$ is the collection of Lebesgue
measurable sets in $(0, \infty)$, and $d\mu_1(r) = r^{d-1} dr$ in the sense that $\mu_1(E) =
\int_E r^{d-1} \, dr$. Next, $X_2$ is the unit sphere $S^{d-1}$, and the measure $\mu_2$ is
the one in effect determined by (9) with $\mu_2 = \sigma$. Indeed given any set
$E \subset S^{d-1}$ we let $\tilde{E} = \{x \in \mathbb{R}^d : x/|x| \in E, \ 0 < |x| < 1\}$ be the “sector”
in the unit ball whose “end-points” are in $E$. We shall say $E \in \mathcal{M}_2$
exactly when $\tilde{E}$ is a Lebesgue measurable subset of $\mathbb{R}^d$, and define
$\mu_2(E) = \sigma(E) = d \cdot m(\tilde{E})$, where $m$ is Lebesgue measure in $\mathbb{R}^d$.

With this it is clear that both $(X_1, \mathcal{M}_1, \mu_1)$ and $(X_2, \mathcal{M}_2, \mu_2)$ satisfy
all the properties of complete and $\sigma$-finite measure spaces. We note also
that the sphere $S^{d-1}$ has a metric on it given by $d(\gamma, \gamma') = |\gamma - \gamma'|$, for
$\gamma, \gamma' \in S^{d-1}$. If $E$ is an open set (with respect to this metric) in $S^{d-1}$,
then $\tilde{E}$ is open in $\mathbb{R}^d$, and hence $E$ is a measurable set in $S^{d-1}$.

Theorem 3.4 Suppose $f$ is an integrable function on $\mathbb{R}^d$. Then for
almost every $\gamma \in S^{d-1}$ the slice $f^\gamma$ defined by $f^\gamma(r) = f(r\gamma)$ is an integrable
function with respect to the measure $r^{d-1} dr$. Moreover, $\int_0^\infty f^\gamma(r) r^{d-1} \, dr$
is integrable on $S^{d-1}$ and the identity (9) holds.

There is a corresponding result with the order of integration of $r$ and
$\gamma$ reversed.

Proof. We consider the product measure $\mu = \mu_1 \times \mu_2$ on $X_1 \times X_2$
given by Theorem 3.3. Since the space $X_1 \times X_2 = \{(r, \gamma) : 0 < r <
\infty \text{ and } \gamma \in S^{d-1}\}$ can be identified with $\mathbb{R}^d - \{0\}$, we can think of $\mu$
as a measure of the latter space, and our main task is to identify it with
the (restriction of) Lebesgue measure on that space. We claim first that

\[ m(E) = \mu(E) \tag{10} \]

whenever $E$ is a measurable rectangle $E = E_1 \times E_2$, and in this case
$\mu(E) = \mu_1(E_1)\mu_2(E_2)$. In fact this holds for $E_2$ an arbitrary measurable
subset of $S^{d-1}$ and $E_1 = (0, 1)$, because then $E = E_1 \times E_2$ is the sector
$\tilde{E}_2$, while $\mu_1(E_1) = 1/d$.

Because of the relative dilation-invariance of Lebesgue measure, (10)
also holds when $E = (0, b) \times E_2$, $b > 0$. A simple limiting argument then
proves the result for sets $E_1 = (0, a]$, and by subtraction to all open
intervals $E_1 = (a, b)$, and thus for all open sets. Thus we have $m(E_1 \times
E_2) = \mu_1(E_1)\mu_2(E_2)$ for all open sets $E_1$, and hence for all closed sets,
and therefore for all Lebesgue measurable sets. (In fact, we can find
sets $F_1 \subset E_1 \subset O_1$ with $F_1$ closed and $O_1$ open, such that $\mu_1(O_1) - \epsilon \le
\mu_1(E_1) \le \mu_1(F_1) + \epsilon$, and apply the above to $F_1 \times E_2$ and $O_1 \times E_2$.)
So we have established the identity (10) for all measurable rectangles
and as a result for all finite unions of measurable rectangles. This is
the algebra $\mathcal{A}$ that occurs in the proof of Theorem 3.3, and hence by
the uniqueness in Theorem 1.5, the identity extends to the $\sigma$-algebra
generated by $\mathcal{A}$, which is the $\sigma$-algebra $\mathcal{M}$ on which the measure $\mu$ is
defined. To summarize, whenever $E \in \mathcal{M}$, the assertion (9) holds for
$f = \chi_E$.

To go further we note that any open set in $\mathbb{R}^d - \{0\}$ can be written
as a countable union of rectangles $\bigcup_{j=1}^\infty A_j \times B_j$, where $A_j$ and $B_j$ are
open in $(0, \infty)$ and $S^{d-1}$, respectively. (This small technical point is
taken up in Exercise 12.) It follows that any open set is in $\mathcal{M}$, and
therefore so is any Borel set. Thus (9) is valid for $\chi_E$ whenever $E$ is
any Borel set in $\mathbb{R}^d - \{0\}$. The result then goes over to any Lebesgue
set $E' \subset \mathbb{R}^d - \{0\}$, since such a set can be written as a disjoint union
$E' = E \cup Z$, where $E$ is a Borel set and $Z \subset F$, with $F$ a Borel set
of measure zero. To finish the proof we follow the familiar steps of
deducing (9) for simple functions, and then by monotonic convergence
for non-negative integrable functions, and from that for the general case.
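Two quick consequences of (9) are worked out here as a consistency check on the definitions above; the radial profile $g$ and the symbol $B$ for the unit ball are introduced only for this example. For an integrable radial function $f(x) = g(|x|)$ the inner integral in (9) does not depend on $\gamma$, and the choice $g = \chi_{(0,1)}$ recovers the normalization $\sigma(S^{d-1}) = d \cdot m(B)$ built into the definition of $\sigma$.

```latex
\[
\int_{\mathbb{R}^d} g(|x|) \, dx = \sigma(S^{d-1}) \int_0^\infty g(r) \, r^{d-1} \, dr ,
\qquad \text{so} \qquad
m(B) = \sigma(S^{d-1}) \int_0^1 r^{d-1} \, dr = \frac{\sigma(S^{d-1})}{d} .
\]
For $d = 2$ this gives $\sigma(S^1) = 2 \, m(B) = 2\pi$, and therefore
\[
\int_{\mathbb{R}^2} e^{-|x|^2} \, dx = 2\pi \int_0^\infty e^{-r^2} \, r \, dr = 2\pi \cdot \frac{1}{2} = \pi .
\]
```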


3.3 Borel measures on $\mathbb{R}$ and the Lebesgue-Stieltjes integral

The Stieltjes integral was introduced to provide a generalization of the
Riemann integral $\int_a^b f(x) \, dx$, where the increments $dx$ were replaced by
the increments $dF(x)$ for a given increasing function $F$ on $[a, b]$. We wish
to pursue this idea from the general point of view taken in this chapter.
The question that is then raised is that of characterizing the measures
on $\mathbb{R}$ that arise in this way, and in particular measures defined on the
Borel sets on the real line.

To have a unique correspondence between measures and increasing
functions as we shall have below, we need first to normalize these
functions appropriately. Recall that an increasing function $F$ can have at
most a countable number of discontinuities. If $x_0$ is such a discontinuity,
then

\[ \lim_{\substack{x < x_0 \\ x \to x_0}} F(x) = F(x_0^-) \quad \text{and} \quad \lim_{\substack{x > x_0 \\ x \to x_0}} F(x) = F(x_0^+) \]

both exist, while $F(x_0^-) < F(x_0^+)$ and $F(x_0)$ is some value between $F(x_0^-)$
and $F(x_0^+)$. We shall now modify $F$ at $x_0$, if necessary, by setting
$F(x_0) = F(x_0^+)$, and we do this for every point of discontinuity. The
function $F$ so obtained is now still increasing, yet right-continuous at
every point, and we say such functions are normalized. The main result
is then as follows.

Theorem 3.5 Let $F$ be an increasing function on $\mathbb{R}$ that is normalized.
Then there is a unique measure $\mu$ (also denoted by $dF$) on the Borel
sets $\mathcal{B}$ on $\mathbb{R}$ such that $\mu((a, b]) = F(b) - F(a)$ if $a < b$. Conversely, if
$\mu$ is a measure on $\mathcal{B}$ that is finite on bounded intervals, then $F$ defined
by $F(x) = \mu((0, x])$, $x > 0$, $F(0) = 0$ and, $F(x) = -\mu((x, 0])$, $x < 0$, is
increasing and normalized.

Before we come to the proof, we remark that the condition that $\mu$ be
finite on bounded intervals is crucial. In fact, the Hausdorff measures
that will be considered in the next chapter provide examples of Borel
measures on $\mathbb{R}$ of a very different character from those treated in the
theorem.

Proof. We define a function $\mu_*$ on all subsets of $\mathbb{R}$ by

\[ \mu_*(E) = \inf \sum_{j=1}^\infty (F(b_j) - F(a_j)), \]

where the infimum is taken over all coverings of $E$ of the form $\bigcup_{j=1}^\infty (a_j, b_j]$.
It is easy to verify that $\mu_*$ is an exterior measure on $\mathbb{R}$. We observe
next that $\mu_*((a, b]) = F(b) - F(a)$, if $a < b$. Clearly $\mu_*((a, b]) \le F(b) -
F(a)$, since $(a, b]$ covers itself. Next, suppose that $\bigcup_{j=1}^\infty (a_j, b_j]$
covers $(a, b]$; then it covers $[a', b]$ for any $a < a' < b$. However, by the
right-continuity of $F$, if $\epsilon > 0$ is given, we can always choose $b'_j > b_j$ such
that $F(b'_j) \le F(b_j) + \epsilon/2^j$. Now the union of open intervals $\bigcup_{j=1}^\infty (a_j, b'_j)$
covers $[a', b]$. By the compactness of this interval, $\bigcup_{j=1}^N (a_j, b'_j)$ covers
$[a', b]$ for some $N$. Thus since $F$ is increasing we have

\[ F(b) - F(a') \le \sum_{j=1}^N F(b'_j) - F(a_j) \le \sum_{j=1}^N \left( F(b_j) - F(a_j) + \epsilon/2^j \right)
   \le \mu_*((a, b]) + \epsilon. \]

Thus letting $a' \to a$, and using the right-continuity of $F$ again, we see
that $F(b) - F(a) \le \mu_*((a, b]) + \epsilon$. Since $\epsilon$ was arbitrary this then proves
$F(b) - F(a) = \mu_*((a, b])$.

Next we show that $\mu_*$ is a metric exterior measure (for the usual
metric $d(x, x') = |x - x'|$ on the real line). Since $\mu_*$ is an exterior measure
we have $\mu_*(E_1 \cup E_2) \le \mu_*(E_1) + \mu_*(E_2)$; thus it suffices to see that the
reverse inequality holds whenever $d(E_1, E_2) \ge \delta$, for some $\delta > 0$.

Suppose that we are given a positive $\epsilon$, and that $\bigcup_{j=1}^\infty (a_j, b_j]$ is a
covering of $E_1 \cup E_2$ such that

\[ \sum_{j=1}^\infty F(b_j) - F(a_j) \le \mu_*(E_1 \cup E_2) + \epsilon. \]

We may assume, after subdividing the intervals $(a_j, b_j]$ into smaller
half-open intervals, that each interval in the covering has length less than $\delta$.
When this is so each interval can intersect at most one of the two sets $E_1$
or $E_2$. If we denote by $J_1$ and $J_2$ the sets of those indices for which $(a_j, b_j]$
intersects $E_1$ and $E_2$, respectively, then $J_1 \cap J_2$ is empty; moreover, we
have $E_1 \subset \bigcup_{j \in J_1} (a_j, b_j]$ as well as $E_2 \subset \bigcup_{j \in J_2} (a_j, b_j]$. Therefore

\[ \mu_*(E_1) + \mu_*(E_2) \le \sum_{j \in J_1} F(b_j) - F(a_j) + \sum_{j \in J_2} F(b_j) - F(a_j)
   \le \sum_{j=1}^\infty F(b_j) - F(a_j) \le \mu_*(E_1 \cup E_2) + \epsilon. \]

Since $\epsilon$ was arbitrary, we see that $\mu_*(E_1) + \mu_*(E_2) \le \mu_*(E_1 \cup E_2)$, as we
intended to show.

We can now invoke Theorem 1.2. This guarantees the existence of a
measure $\mu$ for which the Borel sets are measurable; moreover, we have
$\mu((a, b]) = F(b) - F(a)$, since clearly $(a, b]$ is a Borel set and we have
previously seen that $\mu_*((a, b]) = F(b) - F(a)$.

To prove that $\mu$ is the unique Borel measure on $\mathbb{R}$ for which $\mu((a, b]) =
F(b) - F(a)$, let us suppose that $\nu$ is another Borel measure with this
property. It now suffices to show that $\nu = \mu$ on all Borel sets.

We can write any open interval as a disjoint union $(a, b) = \bigcup_{j=1}^\infty (a_j, b_j]$,
by choosing $\{b_j\}_{j=1}^\infty$ to be a strictly increasing sequence with $a < b_j < b$,
$b_j \to b$ as $j \to \infty$, and taking $a_1 = a$, $a_{j+1} = b_j$. Since $\nu$ and $\mu$ agree on
each $(a_j, b_j]$, it follows that $\nu$ and $\mu$ agree on $(a, b)$, and hence on all
open intervals, and therefore on all open sets. Moreover, clearly $\nu$ and $\mu$
are finite on all bounded intervals; thus the regularity in Proposition 1.3
allows one to conclude that $\mu = \nu$ on all Borel sets.

Conversely, if we start with a Borel measure $\mu$ on $\mathbb{R}$ that is finite on
bounded intervals, we can define the function $F$ as in the statement of the
theorem. Then clearly $F$ is increasing. To see that it is right-continuous,
note that if, for instance, $x_0 > 0$, the sets $E_n = (0, x_0 + 1/n]$ decrease
to $E = (0, x_0]$ as $n \to \infty$, hence $\mu(E_n) \to \mu(E)$, since $\mu(E_1) < \infty$. This
means that $F(x_0 + 1/n) \to F(x_0)$. Since $F$ is increasing, this implies
that $F$ is right-continuous at $x_0$. The argument for any $x_0 \le 0$ is similar,
and thus the theorem is proved.
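To make the correspondence in Theorem 3.5 concrete, the following Python sketch evaluates the measure of half-open intervals for one particular normalized function. The specific `F` (slope 1 plus a jump of size 2 at x = 1) and the helper name `mu` are assumptions chosen for illustration only.

```python
def F(x):
    """An increasing, right-continuous function: slope 1 plus a jump of size 2 at x = 1."""
    return x + (2.0 if x >= 1.0 else 0.0)

def mu(a, b):
    """Measure of the half-open interval (a, b], a < b, as in Theorem 3.5."""
    return F(b) - F(a)

# Additivity over adjacent intervals: (0, 2] is the disjoint union of (0, 1] and (1, 2].
print(mu(0.0, 1.0) + mu(1.0, 2.0), mu(0.0, 2.0))

# The mass at the jump point is the limit of mu((1 - 1/n, 1]) as n grows,
# while mu((1, 1 + 1/n]) shrinks to 0 because F is right-continuous.
for n in (10, 1000, 100000):
    print(n, mu(1.0 - 1.0 / n, 1.0), mu(1.0, 1.0 + 1.0 / n))
```

The first column of the loop approaches the size of the jump, which is the measure of the single point 1; the second approaches 0, reflecting the choice of right-continuity in the normalization.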


Remarks. Several comments about the theorem are in order. 


(i) Two increasing functions $F$ and $G$ give the same measure if $F -
G$ is constant. The converse is also true because $F(b) - F(a) =
G(b) - G(a)$ for all $a < b$ exactly when $F - G$ is constant.

(ii) The measure $\mu$ constructed in the proof of the theorem is defined
on a larger $\sigma$-algebra than the Borel sets, and is actually complete.
However, in applications, its restriction to the Borel sets often
suffices.


(iii) If $F$ is an increasing normalized function given on a closed interval
$[a, b]$, we can extend it to $\mathbb{R}$ by setting $F(x) = F(a)$ for $x < a$, and
$F(x) = F(b)$ for $x > b$. For the resulting measure $\mu$, the intervals
$(-\infty, a]$ and $(b, \infty)$ have measure zero. One then often writes

\[ \int_{\mathbb{R}} f(x) \, d\mu(x) = \int_a^b f(x) \, dF(x), \]

for every $f$ that is integrable with respect to $\mu$. If $F$ arises from an
increasing function $F_0$ defined on $\mathbb{R}$, one may wish to account for
the possible jump of $F_0$ at $a$. In this case it is sometimes useful to
define

\[ \int_{a^-}^b f(x) \, dF(x) \quad \text{as} \quad \int_{[a,b]} f(x) \, d\mu_0(x), \]

where $\mu_0$ is the measure on $\mathbb{R}$ corresponding to $F_0$.

(iv) Note that the above definition of the Lebesgue-Stieltjes integral
extends to the case when $F$ is of bounded variation. Indeed suppose
$F$ is a complex-valued function on $[a, b]$ such that $F = \sum_{j=1}^4 \epsilon_j F_j$,
where each $F_j$ is increasing and normalized, and $\epsilon_j$ are $\pm 1$ or $\pm i$.
Then we can define $\int_a^b f(x) \, dF(x)$ as $\sum_{j=1}^4 \epsilon_j \int_a^b f(x) \, dF_j(x)$; here
we require that $f$ be integrable with respect to the Borel measure
$\mu = \sum_{j=1}^4 \mu_j$, where $\mu_j$ is the measure corresponding to $F_j$.

(v) The value of these integrals can be calculated more directly in the
following cases.

(a) If $F$ is an absolutely continuous function on $[a, b]$, then

    $\int_a^b f(x)\,dF(x) = \int_a^b f(x) F'(x)\,dx$

for every Borel measurable function $f$ that is integrable with
respect to $\mu = dF$.

(b) Suppose $F$ is a pure jump function as in Section 3.3, Chapter 3,
with jumps $\{\alpha_n\}_{n=1}^{\infty}$ at the points $\{x_n\}_{n=1}^{\infty}$. Then whenever
$f$ is, say, continuous and vanishes outside some finite
interval we have

    $\int f(x)\,dF(x) = \sum_{n=1}^{\infty} f(x_n)\,\alpha_n$.

In particular, for the measure $\mu$ we have $\mu(\{x_n\}) = \alpha_n$ and
$\mu(E) = 0$ for all sets $E$ that do not contain any of the $x_n$.

(c) A special instance arises when $F = H$, the Heaviside function
defined by $H(x) = 1$ for $x \ge 0$, and $H(x) = 0$ for $x < 0$. Then

    $\int_{-\infty}^{\infty} f(x)\,dH(x) = f(0)$,

which is another expression for the Dirac delta function arising
in Section 2 of Chapter 3.


Further details about (v) can be found in Exercise 11. 
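
As an illustration of case (b) and (c) above, here is a minimal Python sketch, assuming a pure jump function with only finitely many jumps. The helper name `stieltjes_jump_integral` and the sample jump data are hypothetical and chosen only for this example; exact rational arithmetic is used so that the sums can be checked by hand.

```python
from fractions import Fraction

def stieltjes_jump_integral(f, jumps):
    """Integral of f against dF for a pure jump function F with finitely many jumps.

    `jumps` maps each jump point x_n to its jump size alpha_n > 0, so the
    associated measure is mu = sum_n alpha_n * delta_{x_n}.
    """
    return sum(alpha * f(x) for x, alpha in jumps.items())

# Heaviside function: a single unit jump at 0, so the integral of f dH is f(0).
heaviside = {Fraction(0): Fraction(1)}
delta_value = stieltjes_jump_integral(lambda x: x * x + 3, heaviside)

# Three jumps of sizes 1/2, 1/4, 1/4 at the points 0, 1, 2.
jumps = {Fraction(0): Fraction(1, 2), Fraction(1): Fraction(1, 4), Fraction(2): Fraction(1, 4)}
mean = stieltjes_jump_integral(lambda x: x, jumps)
```

With the Heaviside data the integral simply evaluates the integrand at 0, which is the Dirac delta behaviour described in (c).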


4 Absolute continuity of measures 

The generalization of the notion of absolute continuity considered in 
Chapter 3 requires that we extend the ideas of a measure to encompass 
set functions that may be positive or negative. We describe this notion 
first. 

4.1 Signed measures 

Loosely speaking, a signed measure possesses all the properties of a measure,
except that it may take positive or negative values. More precisely,
a signed measure $\nu$ on a $\sigma$-algebra $\mathcal{M}$ is a mapping that satisfies:

(i) The set function $\nu$ is extended-valued in the sense that
$-\infty < \nu(E) \le \infty$ for all $E \in \mathcal{M}$.

(ii) If $\{E_j\}_{j=1}^{\infty}$ are disjoint sets belonging to $\mathcal{M}$, then

    $\nu\left(\bigcup_{j=1}^{\infty} E_j\right) = \sum_{j=1}^{\infty} \nu(E_j)$.

Note that for this to hold the sum $\sum \nu(E_j)$ must be independent of
the rearrangements of terms, so that if $\nu\left(\bigcup_{j=1}^{\infty} E_j\right)$ is finite, it implies
that the sum converges absolutely.


Examples of signed measures arise naturally if we drop the assumption
that $f$ be non-negative in the expression

    $\nu(E) = \int_E f\,d\mu$,

where $(X, \mathcal{M}, \mu)$ is a measure space and $f$ is $\mu$-measurable. In fact,
to ensure that $\nu$ satisfies (i) and (ii) the function $f$ is required to be
“integrable” with respect to $\mu$ in the extended sense that $\int f^-\,d\mu$ must
be finite, while $\int f^+\,d\mu$ may be infinite.

Given a signed measure $\nu$ on $(X, \mathcal{M})$ it is always possible to find a
(positive) measure $\mu$ that dominates $\nu$, in the sense that

    $\nu(E) \le \mu(E)$ for all $E$,

and that in addition is the “smallest” $\mu$ that has this property.

The construction is in effect an abstract version of the decomposition
of a function of bounded variation as the difference of two increasing
functions, as carried out in Chapter 3. We proceed as follows. We define
a function $|\nu|$ on $\mathcal{M}$, called the total variation of $\nu$, by

    $|\nu|(E) = \sup \sum_{j=1}^{\infty} |\nu(E_j)|$,

where the supremum is taken over all partitions of $E$, that is, over all
countable unions $E = \bigcup_{j=1}^{\infty} E_j$, where the sets $E_j$ are disjoint and belong
to $\mathcal{M}$.

The fact that $|\nu|$ is actually additive is not obvious, and is given in the
proof below.


Proposition 4.1 The total variation $|\nu|$ of a signed measure $\nu$ is itself
a (positive) measure that satisfies $\nu \le |\nu|$.




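Proof. Suppose $\{E_j\}_{j=1}^{\infty}$ is a countable collection of disjoint sets in
$\mathcal{M}$, and let $E = \bigcup E_j$. It suffices to prove:

(11)    $\sum_j |\nu|(E_j) \le |\nu|(E)$  and  $|\nu|(E) \le \sum_j |\nu|(E_j)$.

Let $\alpha_j$ be a real number that satisfies $\alpha_j < |\nu|(E_j)$. By definition, each
$E_j$ can be written as $E_j = \bigcup_i F_{i,j}$, where the $F_{i,j}$ are disjoint, belong to
$\mathcal{M}$, and

    $\alpha_j \le \sum_{i=1}^{\infty} |\nu(F_{i,j})|$.

Since $E = \bigcup_{i,j} F_{i,j}$, we have

    $\sum_j \alpha_j \le \sum_{j,i} |\nu(F_{i,j})| \le |\nu|(E)$.

Consequently, taking the supremum over the numbers $\alpha_j$ gives the first
inequality in (11).

For the reverse inequality, let $\{F_k\}$ be any other partition of $E$. For a
fixed $k$, $\{F_k \cap E_j\}_j$ is a partition of $F_k$, so

    $\sum_k |\nu(F_k)| = \sum_k \left| \sum_j \nu(F_k \cap E_j) \right|$,

since $\nu$ is a signed measure. An application of the triangle inequality and
the fact that $\{F_k \cap E_j\}_k$ is a partition of $E_j$ gives

    $\sum_k |\nu(F_k)| \le \sum_k \sum_j |\nu(F_k \cap E_j)| = \sum_j \sum_k |\nu(F_k \cap E_j)| \le \sum_j |\nu|(E_j)$.

Since $\{F_k\}$ was an arbitrary partition of $E$, we obtain the second
inequality in (11) and the proof is complete.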

It is now possible to write $\nu$ as the difference of two (positive) measures.
To see this, we define the positive variation and negative variation
of $\nu$ by

    $\nu^+ = \frac{1}{2}(|\nu| + \nu)$  and  $\nu^- = \frac{1}{2}(|\nu| - \nu)$.

By the proposition we see that $\nu^+$ and $\nu^-$ are measures, and they clearly
satisfy

    $\nu = \nu^+ - \nu^-$  and  $|\nu| = \nu^+ + \nu^-$.

In the above if $\nu(E) = \infty$ for a set $E$, then $|\nu|(E) = \infty$, and $\nu^-(E)$ is
defined to be zero.

We also make the following definition: we say that the signed measure
$\nu$ is $\sigma$-finite if the measure $|\nu|$ is $\sigma$-finite. Since $\nu \le |\nu|$ and $|-\nu| = |\nu|$,
we find that

    $-|\nu| \le \nu \le |\nu|$.

As a result, if $\nu$ is $\sigma$-finite, then so are $\nu^+$ and $\nu^-$.
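
The decomposition of a signed measure into its positive and negative variations can be sketched concretely on a finite set. The following Python fragment is only an illustration under stated assumptions: the set $X = \{0, 1, 2, 3\}$ and the point weights are hypothetical, and for a measure given by point weights the supremum defining the total variation is attained by the partition into singletons.

```python
from fractions import Fraction

# Hypothetical example: a signed measure on X = {0, 1, 2, 3} given by point weights.
weights = {0: Fraction(2), 1: Fraction(-3), 2: Fraction(1, 2), 3: Fraction(-1, 2)}

def nu(E):
    """Signed measure of a subset E of X."""
    return sum(weights[x] for x in E)

def total_variation(E):
    """|nu|(E): for point weights the partition into singletons attains the supremum."""
    return sum(abs(weights[x]) for x in E)

def nu_plus(E):
    """Positive variation (|nu| + nu) / 2."""
    return (total_variation(E) + nu(E)) / 2

def nu_minus(E):
    """Negative variation (|nu| - nu) / 2."""
    return (total_variation(E) - nu(E)) / 2

X = set(weights)
```

Here `nu_plus` collects the positive weights and `nu_minus` the absolute values of the negative ones, so their difference recovers `nu` and their sum recovers the total variation.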

4.2 Absolute continuity 

Given two measures defined on a common $\sigma$-algebra we describe here the
relationships that can exist between them. More concretely, consider two
measures $\nu$ and $\mu$ defined on the $\sigma$-algebra $\mathcal{M}$; two extreme scenarios
are

(a) $\nu$ and $\mu$ are “supported” on separate parts of $\mathcal{M}$.

(b) The support of $\nu$ is an essential part of the support of $\mu$.

Here we adopt the terminology that the measure $\nu$ is supported on a
set $A$, if $\nu(E) = \nu(E \cap A)$ for all $E \in \mathcal{M}$.

The Lebesgue-Radon-Nikodym theorem below states that in a precise
sense the relationship between any two measures $\nu$ and $\mu$ is a combination
of the above two possibilities.

Mutually singular and absolutely continuous measures 

Two signed measures $\nu$ and $\mu$ on $(X, \mathcal{M})$ are mutually singular if there
are disjoint subsets $A$ and $B$ in $\mathcal{M}$ so that

    $\nu(E) = \nu(A \cap E)$  and  $\mu(E) = \mu(B \cap E)$  for all $E \in \mathcal{M}$.

Thus $\nu$ and $\mu$ are supported on disjoint subsets. We use the symbol
$\nu \perp \mu$ to denote the fact that the measures are mutually singular.

In contrast, if $\nu$ is a signed measure and $\mu$ a (positive) measure on $\mathcal{M}$,
we say that $\nu$ is absolutely continuous with respect to $\mu$ if

(12)    $\nu(E) = 0$ whenever $E \in \mathcal{M}$ and $\mu(E) = 0$.

Thus if $\nu$ is supported in a set $A$, then $A$ must be an essential part of the
support of $\mu$ in the sense that $\mu(A) > 0$. We use the symbol $\nu \ll \mu$ to
indicate that $\nu$ is absolutely continuous with respect to $\mu$. Note that if
$\nu$ and $\mu$ are mutually singular, and $\nu$ is also absolutely continuous with
respect to $\mu$, then $\nu$ vanishes identically.

An important example is given by integration with respect to $\mu$. Indeed,
if $f \in L^1(X, \mu)$, or if $f$ is merely integrable in the extended sense
(where $\int f^- < \infty$, but possibly $\int f^+ = \infty$), then the signed measure $\nu$
defined by

(13)    $\nu(E) = \int_E f\,d\mu$

is absolutely continuous with respect to $\mu$. We shall use the shorthand
$d\nu = f\,d\mu$ to indicate that $\nu$ is defined by (13).

This is a variant of the notion of absolute continuity that arose in
Chapter 3 in the special case of $\mathbb{R}$ (with $\mathcal{M}$ the Lebesgue measurable
sets and $d\mu = dx$ the Lebesgue measure). In fact, with $\nu$ defined by (13)
and $f$ an integrable function, we saw that in place of (12) we had the
following stronger assertion:

(14)    For each $\epsilon > 0$, there is a $\delta > 0$ such that $\mu(E) < \delta$ implies $|\nu(E)| < \epsilon$.

In the general situation the relation between the two conditions (12) 
and (14) is clarified by the following observation. 

Proposition 4.2 The assertion (14) implies (12). Conversely, if $|\nu|$ is
a finite measure, then (12) implies (14).

That (12) is a consequence of (14) is obvious because $\mu(E) = 0$ gives
$|\nu(E)| < \epsilon$ for every $\epsilon > 0$. To prove the converse, it suffices to consider
the case when $\nu$ is positive, upon replacing $\nu$ by $|\nu|$. We then assume
that (14) does not hold. This means that it fails for some fixed $\epsilon > 0$.
Hence for each $n$, there is a measurable set $E_n$ with $\mu(E_n) < 2^{-n}$
while $\nu(E_n) \ge \epsilon$. Now let $E^* = \limsup_{n \to \infty} E_n = \bigcap_{n=1}^{\infty} E_n^*$, where
$E_n^* = \bigcup_{k \ge n} E_k$. Then since $\mu(E_n^*) \le \sum_{k \ge n} 1/2^k = 1/2^{n-1}$, and the decreasing
sets $\{E_n^*\}$ are contained in a set of finite measure ($E_1^*$), we get $\mu(E^*) = 0$.
However $\nu(E_n^*) \ge \nu(E_n) \ge \epsilon$, and the $\nu$ measure is assumed finite. So
$\nu(E^*) = \lim_{n \to \infty} \nu(E_n^*) \ge \epsilon$, which gives a contradiction.

After these preliminaries we can come to the main result. It guarantees
among other things a converse to the representation (13); it was proved
in the case of $\mathbb{R}$ by Lebesgue, and in the general case by Radon and
Nikodym.

Theorem 4.3 Suppose $\mu$ is a $\sigma$-finite positive measure on the measure
space $(X, \mathcal{M})$ and $\nu$ a $\sigma$-finite signed measure on $\mathcal{M}$. Then there exist
unique signed measures $\nu_a$ and $\nu_s$ on $\mathcal{M}$ such that $\nu_a \ll \mu$, $\nu_s \perp \mu$ and
$\nu = \nu_a + \nu_s$. In addition, the measure $\nu_a$ takes the form $d\nu_a = f\,d\mu$; that
is,

    $\nu_a(E) = \int_E f(x)\,d\mu(x)$

for some extended $\mu$-integrable function $f$.

Note the following consequence. If $\nu$ is absolutely continuous with respect
to $\mu$, then $d\nu = f\,d\mu$, and this assertion can be viewed as a generalization
of Theorem 3.11 in Chapter 3.

There are several known proofs of the above theorem. The argument 
given below, due to von Neumann, has the virtue that it exploits elegantly 
the application of a simple Hilbert space idea. 

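We start with the case when both $\nu$ and $\mu$ are positive and finite. Let
$\rho = \nu + \mu$, and consider the transformation on $L^2(X, \rho)$ defined by

    $\ell(\psi) = \int_X \psi(x)\,d\nu(x)$.

The mapping $\ell$ defines a bounded linear functional on $L^2(X, \rho)$ since

    $|\ell(\psi)| \le \int_X |\psi(x)|\,d\nu(x) \le \int_X |\psi(x)|\,d\rho(x) \le (\rho(X))^{1/2} \left( \int_X |\psi(x)|^2\,d\rho(x) \right)^{1/2}$,

where the last inequality follows by the Cauchy-Schwarz inequality. But
$L^2(X, \rho)$ is a Hilbert space, so the Riesz representation theorem (in Chapter 4)
guarantees the existence of $g \in L^2(X, \rho)$ such that

(15)    $\int_X \psi(x)\,d\nu(x) = \int_X \psi(x) g(x)\,d\rho(x)$  for all $\psi \in L^2(X, \rho)$.

If $E \in \mathcal{M}$ with $\rho(E) > 0$, when we set $\psi = \chi_E$ in (15) and recall that
$\nu \le \rho$, we find

    $0 \le \frac{1}{\rho(E)} \int_E g(x)\,d\rho(x) \le 1$,

from which we conclude that $0 \le g(x) \le 1$ for a.e. $x$ (with respect to the
measure $\rho$). In fact, $0 \le \int_E g(x)\,d\rho(x)$ for all sets $E \in \mathcal{M}$ implies that
$g(x) \ge 0$ almost everywhere. In the same way, $0 \le \int_E (1 - g(x))\,d\rho(x)$
for all $E \in \mathcal{M}$ guarantees that $g(x) \le 1$ almost everywhere. Therefore
we may clearly assume $0 \le g(x) \le 1$ for all $x$ without disturbing the
identity (15), which we rewrite as

(16)    $\int_X \psi (1 - g)\,d\nu = \int_X \psi g\,d\mu$.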
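Consider now the two sets

    $A = \{x \in X : 0 \le g(x) < 1\}$  and  $B = \{x \in X : g(x) = 1\}$,

and define two measures $\nu_a$ and $\nu_s$ on $\mathcal{M}$ by

    $\nu_a(E) = \nu(A \cap E)$  and  $\nu_s(E) = \nu(B \cap E)$.

To see why $\nu_s \perp \mu$, it suffices to note that setting $\psi = \chi_B$ in (16) gives

    $0 = \int \chi_B\,d\mu = \mu(B)$.

Finally, we set $\psi = \chi_E (1 + g + \cdots + g^n)$ in (16):

(17)    $\int_E (1 - g^{n+1})\,d\nu = \int_E g (1 + g + \cdots + g^n)\,d\mu$.

Since $(1 - g^{n+1})(x) = 0$ if $x \in B$, and $(1 - g^{n+1})(x) \to 1$ if $x \in A$, the
dominated convergence theorem implies that the left-hand side of (17)
converges to $\nu(A \cap E) = \nu_a(E)$. Also, $1 + g + \cdots + g^n$ converges to
$\frac{1}{1 - g}$, so we find in the limit that

    $\nu_a(E) = \int_E f\,d\mu$,  where $f = \frac{g}{1 - g}$.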
Note that $f \in L^1(X, \mu)$, since $\nu_a(X) \le \nu(X) < \infty$. If $\mu$ and $\nu$ are $\sigma$-finite
and positive we may clearly find disjoint sets $E_j \in \mathcal{M}$ such that $X = \bigcup E_j$ and

    $\mu(E_j) < \infty$,  $\nu(E_j) < \infty$  for all $j$.

We may define positive and finite measures on $\mathcal{M}$ by

    $\mu_j(E) = \mu(E \cap E_j)$  and  $\nu_j(E) = \nu(E \cap E_j)$,

and then we can write for each $j$, $\nu_j = \nu_{j,a} + \nu_{j,s}$, where $\nu_{j,s} \perp \mu_j$ and
$\nu_{j,a} = f_j\,d\mu_j$. Then it suffices to set

    $f = \sum_j f_j$,  $\nu_s = \sum_j \nu_{j,s}$,  and  $\nu_a = \sum_j \nu_{j,a}$.

Finally, if $\nu$ is signed, then we apply the argument separately to the
positive and negative variations of $\nu$.

To prove the uniqueness of the decomposition, suppose we also have
$\nu = \nu_a' + \nu_s'$, where $\nu_a' \ll \mu$ and $\nu_s' \perp \mu$. Then

    $\nu_a - \nu_a' = \nu_s' - \nu_s$.

The left-hand side is absolutely continuous with respect to $\mu$, and the
right-hand side is singular with respect to $\mu$. Thus both sides are zero
and the theorem is proved.
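
The construction in the proof can be traced by hand on a finite set, where every measure is a list of point masses. The sketch below is an illustration only, assuming hypothetical point masses for $\mu$ and $\nu$ on $X = \{0, 1, 2, 3\}$; it follows the steps of the argument above: form $\rho = \nu + \mu$, take $g = d\nu/d\rho$, split $X$ into $A = \{g < 1\}$ and $B = \{g = 1\}$, and set $f = g/(1 - g)$ on $A$.

```python
from fractions import Fraction

# Hypothetical finite positive measures on X = {0, 1, 2, 3}, given by point masses.
mu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(0), 3: Fraction(1, 2)}
nu = {0: Fraction(3), 1: Fraction(0), 2: Fraction(5), 3: Fraction(1)}

rho = {x: mu[x] + nu[x] for x in mu}  # rho = nu + mu

# g is the density of nu with respect to rho (defined where rho has mass).
g = {x: nu[x] / rho[x] for x in rho if rho[x] > 0}

A = {x for x in g if g[x] < 1}    # where 0 <= g < 1
B = {x for x in g if g[x] == 1}   # where g = 1; mu puts no mass here

f = {x: g[x] / (1 - g[x]) for x in A}  # density of the absolutely continuous part

def nu_a(E):
    """Absolutely continuous part: integral of f over E with respect to mu."""
    return sum(f[x] * mu[x] for x in E if x in A)

def nu_s(E):
    """Singular part: the restriction of nu to B."""
    return sum(nu[x] for x in E if x in B)
```

In this example the point 2 carries mass for $\nu$ but none for $\mu$, so it makes up the singular part, while on the remaining points $f$ is just the ratio of the point masses.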

5* Ergodic theorems 

Ergodic theory had its beginnings in certain problems in statistical
mechanics studied in the late nineteenth century. Since then it has grown
rapidly and has gained wide influence in a number of mathematical
disciplines, in particular those related to dynamical systems and probability
theory. It is not our purpose to try to give an account of this broad
and fascinating theory. Rather, we restrict our presentation to some of
the basic limit theorems that lie at its foundation. These theorems are
most naturally formulated in the general context of abstract measure
spaces, and thus for us they serve as excellent illustrations of the general
framework developed in this chapter.

The setting for the theory is a $\sigma$-finite measure space $(X, \mu)$ endowed
with a mapping $\tau : X \to X$ such that whenever $E$ is a measurable
subset of $X$, then so is $\tau^{-1}(E)$, and $\mu(\tau^{-1}(E)) = \mu(E)$. Here $\tau^{-1}(E)$ is
the pre-image of $E$ under $\tau$; that is, $\tau^{-1}(E) = \{x \in X : \tau(x) \in E\}$. A
mapping $\tau$ with these properties is called a measure-preserving
transformation. If in addition for such a $\tau$ we have the feature that it is a
bijection and $\tau^{-1}$ is also a measure-preserving transformation, then $\tau$ is
referred to as a measure-preserving isomorphism.

Let us note that if $\tau$ is a measure-preserving transformation, then
$f(\tau(x))$ is measurable if $f(x)$ is measurable, and is integrable if $f$ is
integrable; moreover, then

(18)    $\int_X f(\tau(x))\,d\mu(x) = \int_X f(x)\,d\mu(x)$.

Indeed, if $\chi_E$ is the characteristic function of the set $E$, we note that
$\chi_E(\tau(x)) = \chi_{\tau^{-1}(E)}(x)$, and so the assertion holds for characteristic
functions of measurable sets and thus for simple functions, and hence by the
usual limiting arguments for all non-negative measurable functions, and
then integrable functions. For later purposes we record here an equivalent
statement: whenever $f$ is a real-valued measurable function and $\alpha$
is any real number, then

    $\mu(\{x : f(x) > \alpha\}) = \mu(\{x : f(\tau(x)) > \alpha\})$.

Before we proceed further, we describe several examples of
measure-preserving transformations:

(i) Here $X = \mathbb{Z}$, the integers, with $\mu$ its counting measure; that is,
$\mu(E) = \#(E)$ = the number of integers in $E$, for any $E \subset \mathbb{Z}$. We
define $\tau$ to be the unit translation, $\tau : n \mapsto n + 1$. Note that $\tau$ gives
a measure-preserving isomorphism of $\mathbb{Z}$.

(ii) Another easy example is $X = \mathbb{R}^d$ with Lebesgue measure, and $\tau$ a
translation, $\tau : x \mapsto x + h$ for some fixed $h \in \mathbb{R}^d$. This is of course
a measure-preserving isomorphism. (See the section on invariance
properties of the Lebesgue measure in Chapter 1.)

(iii) Here $X$ is the unit circle, given as $\mathbb{R}/\mathbb{Z}$, with the measure induced
from Lebesgue measure on $\mathbb{R}$. That is, we may realize $X$ as the unit
interval $(0, 1]$, and take $\mu$ to be the Lebesgue measure restricted
to this interval. For any real number $\alpha$, the translation $x \mapsto x + \alpha$,
taken modulo $\mathbb{Z}$, is well defined on $X = \mathbb{R}/\mathbb{Z}$, and is
measure-preserving. (See the related Exercise 3 in Chapter 2.) It can be
interpreted as a rotation of the circle by angle $2\pi\alpha$.

(iv) In this example $X$ is again $(0, 1]$ with Lebesgue measure $\mu$, but $\tau$
is the doubling map $\tau(x) = 2x \bmod 1$. It is easy to verify that
$\tau$ is a measure-preserving transformation. Indeed, any set $E \subset
(0, 1]$ has two pre-images $E_1$ and $E_2$, the first in $(0, 1/2]$ and the
second in $(1/2, 1]$, both of measure $\mu(E)/2$, if $E$ is measurable.
(See Figure 1.) However, $\tau$ is not an isomorphism, since $\tau$ is not
injective.

(v) A trickier example is given by the transformation that is key in
the theory of continued fractions. Here $X = [0, 1)$ and $\tau$ is defined
by $\tau(x) = \langle 1/x \rangle$, the fractional part of $1/x$; when $x = 0$ we set
$\tau(0) = 0$. Gauss observed, in effect, that the measure $d\mu = \frac{1}{1+x}\,dx$
is preserved by the transformation $\tau$. Note that each $x \in (0, 1)$ has
infinitely many pre-images under $\tau$; that is, the sequence
$\{1/(x + k)\}_{k=1}^{\infty}$. More about this example can be found in Problems 8
through 10 below.

[Figure 1. Pre-images $E_1$ and $E_2$ under the doubling map]
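
For examples (iv) and (v), the measure-preserving property can be checked on intervals. The following Python sketch is an illustration, not part of the original argument: the function names are hypothetical, the doubling map is checked exactly with rational endpoints, and the Gauss map is checked only approximately by summing the measures (with respect to the density $1/(1+x)$) of the first finitely many pieces $(1/(b+k), 1/(a+k))$ of the pre-image of $(a, b)$.

```python
import math
from fractions import Fraction

def doubling_preimage(a, b):
    """Pre-image of (a, b] in (0, 1] under x -> 2x mod 1, as two intervals."""
    return [(a / 2, b / 2), ((a + 1) / 2, (b + 1) / 2)]

def gauss_measure(c, d):
    """Measure of the interval (c, d) for the density 1/(1+x) on [0, 1)."""
    return math.log((1 + d) / (1 + c))

def gauss_preimage_measure(a, b, terms=100_000):
    """Measure of the first `terms` pieces (1/(b+k), 1/(a+k)) of the pre-image of (a, b)."""
    return sum(gauss_measure(1 / (b + k), 1 / (a + k)) for k in range(1, terms + 1))

a, b = Fraction(1, 5), Fraction(7, 10)
pieces = doubling_preimage(a, b)
doubling_total = sum(hi - lo for lo, hi in pieces)  # total length of E_1 and E_2
```

The partial sums for the Gauss map telescope, so the truncation error after $K$ terms is of order $(b - a)/K$; this is why the comparison for (v) is only made up to a small tolerance.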


Having pointed out these examples, we can now return to the general
theory. The notions described above are of interest, in part, because they
abstract the idea of a dynamical system, one whose totality of states is
represented by the space $X$, with each point $x \in X$ giving a particular
state of the system. The mapping $\tau : X \to X$ then describes the
transformation of the system after a unit of time has elapsed. For such a
system there is often associated a notion of “volume” or “mass” that is
unchanged by the evolution, and this is the role of the invariant measure
$\mu$. The iterates, $\tau^n = \tau \circ \tau \circ \cdots \circ \tau$ ($n$ times) describe the evolution of
the system after $n$ units of time, and a principal concern is the average
behaviour, as $n \to \infty$, of various quantities associated with the system.
Thus one is led to study averages

(19)    $A_n(f)(x) = \frac{1}{n} \sum_{k=0}^{n-1} f(\tau^k(x))$,

and their limits as $n \to \infty$. To this we now turn.

5.1 Mean ergodic theorem 

The first theorem dealing with the averages (19) that we consider is
purely Hilbert-space in character. Historically it preceded both
Theorems 5.3 and 5.4 which will be proved below.

For the specific application of the theorem below, one takes the Hilbert
space $\mathcal{H}$ to be $L^2(X, \mathcal{M}, \mu)$. Given the measure-preserving
transformation $\tau$ on $X$, we define the linear operator $T$ on $\mathcal{H}$ by

(20)    $T(f)(x) = f(\tau(x))$.

Then $T$ is an isometry; that is,

(21)    $\|Tf\| = \|f\|$,

where $\|\cdot\|$ denotes the Hilbert space (that is, the $L^2$) norm. This is clear
from (18) with $f$ replaced by $|f|^2$. Observe that if $\tau$ were also supposed
to be a measure-preserving isomorphism, then $T$ would be invertible and
hence unitary; but we do not assume this.

Now with $T$ as above, consider the subspace $S$ of invariant vectors,
$S = \{f \in \mathcal{H} : T(f) = f\}$. Clearly, because of (21), the subspace
$S$ is closed. Let $P$ denote the orthogonal projection on this subspace.
The theorem that follows deals with the “mean” convergence, meaning
convergence in the norm.

Theorem 5.1 Suppose $T$ is an isometry of the Hilbert space $\mathcal{H}$, and let
$P$ be the orthogonal projection on the subspace of the invariant vectors of
$T$. Let $A_n = \frac{1}{n}(I + T + T^2 + \cdots + T^{n-1})$. Then for each $f \in \mathcal{H}$, $A_n(f)$
converges to $P(f)$ in norm, as $n \to \infty$.

Together with the subspace $S$ defined above we consider the subspaces
$S_* = \{f \in \mathcal{H} : T^*(f) = f\}$ and $S_1 = \{f \in \mathcal{H} : f = g - Tg,\ g \in \mathcal{H}\}$; here
$T^*$ denotes the adjoint of $T$. Then like $S$, $S_*$ is closed, but $S_1$ is not
necessarily closed. We denote its closure by $\overline{S_1}$. The proof of the theorem
is based on the following lemma.

Lemma 5.2 The following relations hold among the subspaces $S$, $S_*$,
and $\overline{S_1}$.

(i) $S = S_*$.

(ii) The orthogonal complement of $\overline{S_1}$ is $S$.

Proof. First, since $T$ is an isometry, we have that $(Tf, Tg) = (f, g)$
for all $f, g \in \mathcal{H}$, and thus $T^*T = I$. (See Exercise 22 in Chapter 4.) So
if $Tf = f$ then $T^*Tf = T^*f$, which means that $f = T^*f$. To prove the
converse inclusion, assume $T^*f = f$. As a consequence $(f, T^*f - f) = 0$,
and thus $(Tf, f) - (f, f) = 0$; that is, $(Tf, f) = \|f\|^2$. However, $\|Tf\| =
\|f\|$, so we have in the above an instance of equality for the Cauchy-Schwarz
inequality. As a result of Exercise 2 in Chapter 4 we get $Tf = cf$,
which by the above gives $Tf = f$. Thus part (i) is proved.

Next we observe that $f$ is in the orthogonal complement of $\overline{S_1}$ exactly
when $(f, g - Tg) = 0$ for all $g \in \mathcal{H}$. However, this means that
$(f - T^*f, g) = 0$ for all $g$, and hence $f = T^*f$, which by part (i) means
$f \in S$.

Having established the lemma we can finish the proof of the theorem.
Given any $f \in \mathcal{H}$, we write $f = f_0 + f_1$, where $f_0 \in S$ and $f_1 \in \overline{S_1}$ (since
$S$ and $\overline{S_1}$ are orthogonal complements). We also fix $\epsilon > 0$ and pick $f_1' \in
S_1$ such that $\|f_1 - f_1'\| < \epsilon$. We then write

(22)    $A_n(f) = A_n(f_0) + A_n(f_1') + A_n(f_1 - f_1')$,

and consider each term separately.

For the first term, we recall that $P$ is the orthogonal projection on $S$,
so $P(f) = f_0$, and since $Tf_0 = f_0$ we deduce

    $A_n(f_0) = \frac{1}{n} \sum_{k=0}^{n-1} T^k(f_0) = f_0 = P(f)$  for every $n \ge 1$.

For the second term, we recall the definition of $S_1$ and pick a $g \in \mathcal{H}$
with $f_1' = g - Tg$. Thus

    $A_n(f_1') = \frac{1}{n} \sum_{k=0}^{n-1} T^k(1 - T)(g) = \frac{1}{n} \sum_{k=0}^{n-1} \left( T^k(g) - T^{k+1}(g) \right) = \frac{1}{n} \left( g - T^n(g) \right)$.

Since $T$ is an isometry, the above identity shows that $A_n(f_1')$ converges
to 0 in the norm as $n \to \infty$.

For the last term, we use once again the fact that each $T^k$ is an isometry
to obtain

    $\|A_n(f_1 - f_1')\| \le \frac{1}{n} \sum_{k=0}^{n-1} \|T^k(f_1 - f_1')\| \le \|f_1 - f_1'\| < \epsilon$.

Finally, from (22) and the above three observations, we deduce that
$\limsup_{n \to \infty} \|A_n(f) - P(f)\| \le \epsilon$, and this concludes the proof of the
theorem.
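
The mean ergodic theorem can be observed in a small finite-dimensional case. The sketch below is an illustration under stated assumptions: the Hilbert space is taken to be the functions on the cyclic group of order 5 with counting measure, $T$ is composition with the shift $\tau(x) = x + 1 \bmod 5$ (an isometry whose only invariant vectors are the constants), and the sample vector `f` is hypothetical. Exact rational arithmetic is used, so the squared distances to the projection $P(f)$ (the constant mean) are computed without rounding.

```python
from fractions import Fraction

def T(f):
    """Composition with the cyclic shift tau(x) = x + 1 mod len(f)."""
    n = len(f)
    return [f[(x + 1) % n] for x in range(n)]

def ergodic_average(f, m):
    """A_m(f) = (1/m) * (f + T f + ... + T^{m-1} f)."""
    total = [Fraction(0)] * len(f)
    g = list(f)
    for _ in range(m):
        total = [t + v for t, v in zip(total, g)]
        g = T(g)
    return [t / m for t in total]

def norm_sq(f):
    """Squared Hilbert-space norm for counting measure."""
    return sum(v * v for v in f)

f = [Fraction(v) for v in (4, -1, 0, 5, 2)]
mean = sum(f) / len(f)  # P(f): the projection of f onto the constants
errors = [norm_sq([a - mean for a in ergodic_average(f, m)]) for m in (1, 7, 49, 343)]
```

In this periodic setting the averages equal the mean exactly whenever $m$ is a multiple of 5, and otherwise the error is of size $O(1/m)$, mirroring the term $\frac{1}{n}(g - T^n(g))$ in the proof.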


5.2 Maximal ergodic theorem 

We now turn to the question of almost everywhere convergence of the
averages (19). As in the case of the averages that occur in the
differentiation theorems of Chapter 3, the key to dealing with such pointwise
limits lies in estimates for their corresponding maximal functions. In the
present case this function is defined by

(23)    $f^*(x) = \sup_{1 \le m < \infty} \frac{1}{m} \sum_{k=0}^{m-1} |f(\tau^k(x))|$.

Theorem 5.3 Whenever $f \in L^1(X, \mu)$, the maximal function $f^*(x)$ is
finite for almost every $x$. Moreover, there is a universal constant $A$ so
that

(24)    $\mu(\{x : f^*(x) > \alpha\}) \le \frac{A}{\alpha} \|f\|_{L^1(X, \mu)}$  for all $\alpha > 0$.

There are several proofs of this theorem. The one we choose emphasizes
the close connection to the maximal function given in Section 1.1 of
Chapter 3, and we shall in fact deduce the present theorem from the
one-dimensional case of that chapter. This argument gives the value
$A = 6$ for the constant in (24). By a different argument one can obtain
$A = 1$, but this improvement is not relevant in what follows.

Before beginning the proof, we make some preliminary remarks. Note
that in the present case the function $f^*$ is automatically measurable,
since it is the supremum of a countable number of measurable functions.
Also, we may assume that our function $f$ is non-negative, since otherwise
we may replace it by $|f|$.

Step 1. The case when $X = \mathbb{Z}$ and $\tau : n \mapsto n + 1$.

For each function $f$ on $\mathbb{Z}$, we consider its extension $\tilde{f}$ to $\mathbb{R}$ defined by
$\tilde{f}(x) = f(n)$ for $n \le x < n + 1$, $n \in \mathbb{Z}$. (See Figure 2.)

[Figure 2. Extension of $f$ to $\mathbb{R}$]

Similarly, if $E \subset \mathbb{Z}$, denote by $\tilde{E}$ the set in $\mathbb{R}$ given by $\tilde{E} = \bigcup_{n \in E} [n, n + 1)$.
Note that as a result of these definitions we have $m(\tilde{E}) = \#(E)$ and
$\int_{\mathbb{R}} \tilde{f}(x)\,dx = \sum_{n \in \mathbb{Z}} f(n)$, and thus $\|\tilde{f}\|_{L^1(\mathbb{R})} = \|f\|_{L^1(\mathbb{Z})}$. Here $m$ is the
Lebesgue measure on $\mathbb{R}$, and $\#$ is the counting measure on $\mathbb{Z}$. Note also
that

    $\sum_{k=0}^{m-1} f(n + k) = \int_0^m \tilde{f}(n + t)\,dt$.

However, because $\int_0^m \tilde{f}(n + t)\,dt \le \int_{-1}^m \tilde{f}(x + t)\,dt$ whenever $x \in [n, n + 1)$,
we see that

    $\frac{1}{m} \sum_{k=0}^{m-1} f(n + k) \le \left( \frac{m + 1}{m} \right) \frac{1}{m + 1} \int_{-1}^m \tilde{f}(x + t)\,dt$  if $x \in [n, n + 1)$.

Taking the supremum over all $m \ge 1$ in the above and noting that
$(m + 1)/m \le 2$, we obtain

(25)    $f^*(n) \le 2 (\tilde{f})^*(x)$  whenever $x \in [n, n + 1)$.

To be clear about the notation here: $f^*(n)$ denotes the maximal function
of $f$ on $\mathbb{Z}$ defined by (23), with $f(\tau^k(n)) = f(n + k)$, while $(\tilde{f})^*$ is the
maximal function as defined in Chapter 3, of the extended function $\tilde{f}$
on $\mathbb{R}$.

By (25)

    $\#(\{n : f^*(n) > \alpha\}) \le m(\{x \in \mathbb{R} : (\tilde{f})^*(x) > \alpha/2\})$,

and thus the latter is majorized by $\frac{A'}{\alpha/2} \int \tilde{f}(x)\,dx = \frac{2A'}{\alpha} \|\tilde{f}\|_{L^1(\mathbb{R})}$,
according to the maximal theorem for $\mathbb{R}$. The constant $A'$ that occurs in
that theorem (there denoted by $A$) can be taken to be 3. Hence we have

(26)    $\#(\{n : f^*(n) > \alpha\}) \le \frac{6}{\alpha} \|f\|_{L^1(\mathbb{Z})}$,

since $\|\tilde{f}\|_{L^1(\mathbb{R})} = \|f\|_{L^1(\mathbb{Z})}$. This disposes of the special case when $X = \mathbb{Z}$.
Step 2. The general case. 

By a sleight-of-hand we shall “transfer” the result for $\mathbb{Z}$ just proved to
the general case. We proceed as follows.

For every positive integer $N$, we consider the truncated maximal
function $f_N^*$ defined as

    $f_N^*(x) = \sup_{1 \le m \le N} \frac{1}{m} \sum_{k=0}^{m-1} |f(\tau^k(x))|$.



Since $\{f_N^*(x)\}$ forms an increasing sequence with $N$, and $\lim_{N \to \infty} f_N^*(x) =
f^*(x)$ for every $x$, it suffices to show that

(27)    $\mu(\{x : f_N^*(x) > \alpha\}) \le \frac{A}{\alpha} \|f\|_{L^1(X, \mu)}$,

with constant $A$ independent of $N$. Letting $N \to \infty$ will then give the
desired result.

So in place of $f^*$ we estimate $f_N^*$, and to simplify our notation we write
the latter as $f^*$, dropping the $N$ subscript. Our argument will compare
the maximal function $f^*$ with the special case arising for $\mathbb{Z}$. To clarify
the formula below we temporarily adopt the expedient of denoting the
second maximal function by $\mathcal{M}(f)$. Thus for a positive function $f$ on $\mathbb{Z}$
we set

    $\mathcal{M}(f)(n) = \sup_{1 \le m} \frac{1}{m} \sum_{k=0}^{m-1} f(n + k)$.

Now starting with a function $f$ on $X$ that is integrable, we define the
function $F$ on $X \times \mathbb{Z}$ by

    $F(x, n) = f(\tau^n(x))$ if $n \ge 0$,  and  $F(x, n) = 0$ if $n < 0$.

Then

    $A_m(f)(x) = \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) = \frac{1}{m} \sum_{k=0}^{m-1} F(x, k)$.

In the above we replace $x$ by $\tau^n(x)$; then since $\tau^k(\tau^n(x)) = \tau^{n+k}(x)$, we
have

    $A_m(f)(\tau^n(x)) = \frac{1}{m} \sum_{k=0}^{m-1} F(x, n + k)$.

Now we fix a large positive $a$ and set $b = a + N$. We also write $F_b$ for
the truncated function on $X \times \mathbb{Z}$ defined by $F_b(x, n) = F(x, n)$ if $n < b$,
$F_b(x, n) = 0$ otherwise. We then have

    $A_m(f)(\tau^n(x)) = \frac{1}{m} \sum_{k=0}^{m-1} F_b(x, n + k)$  if $m \le N$ and $n < a$.

Thus

(28)    $f^*(\tau^n(x)) \le \mathcal{M}(F_b)(x, n)$  if $n < a$.

(Recall that $f^*$ is actually $f_N^*$!) This is the comparison of the two
maximal functions we wished to obtain. Now set $E_\alpha = \{x : f^*(x) > \alpha\}$. Then
by the measure-preserving character of $\tau$, $\mu(\{x : f^*(\tau^n(x)) > \alpha\}) =
\mu(E_\alpha)$. Hence on the product space $X \times \mathbb{Z}$ the product measure $\mu \times \#$
of the set $\{(x, n) \in X \times \mathbb{Z} : f^*(\tau^n(x)) > \alpha,\ 0 \le n < a\}$ equals $a\,\mu(E_\alpha)$.
However, because of (28) the $\mu \times \#$ measure of this set is no more than

    $\int_X \#(\{n \in \mathbb{Z} : \mathcal{M}(F_b)(x, n) > \alpha\})\,d\mu$.

Because of the maximal estimate (26) for $\mathbb{Z}$, we see that the integrand
above is no more than

    $\frac{A}{\alpha} \|F_b(x, \cdot)\|_{L^1(\mathbb{Z})} = \frac{A}{\alpha} \sum_{n=0}^{b-1} f(\tau^n(x))$,

with of course $A = 6$.

Hence, integrating this over $X$ and recalling that $\int_X f(\tau^n(x))\,d\mu =
\int_X f(x)\,d\mu$ gives us

    $a\,\mu(E_\alpha) \le \frac{A}{\alpha}\, b\, \|f\|_{L^1(X)} = \frac{A}{\alpha} (a + N) \|f\|_{L^1(X)}$.

Thus $\mu(E_\alpha) \le \frac{A}{\alpha} \left( 1 + \frac{N}{a} \right) \|f\|_{L^1(X)}$, and letting $a \to \infty$ yields estimate (27).
As we have seen, a final limit as $N \to \infty$ then completes the proof.
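
The weak-type estimate (26) on $\mathbb{Z}$ can be checked directly for a finitely supported function. The Python sketch below is a numerical illustration only, assuming a hypothetical non-negative sequence `f` supported on $0, \dots, 5$; it computes the maximal function of forward averages for the translation $n \mapsto n + 1$ and compares the size of the level set with the bound $\frac{6}{\alpha}\|f\|_{L^1(\mathbb{Z})}$.

```python
def maximal_function(f, n):
    """f*(n) = sup_{m>=1} (1/m) * sum_{k<m} |f(n+k)| for f supported on 0..len(f)-1."""
    best, running = 0.0, 0.0
    m = 0
    # Once n + m passes the support the averages can only decrease, so stop there.
    for k in range(n, len(f)):
        m += 1
        if k >= 0:
            running += abs(f[k])
        best = max(best, running / m)
    return best

f = [3, 0, 1, 4, 0, 2]
norm_l1 = sum(abs(v) for v in f)

def level_set_size(alpha):
    """Number of integers n with f*(n) > alpha."""
    # For n < 0 we have f*(n) <= norm_l1 / (1 - n), so f*(n) > alpha forces n > -norm_l1 / alpha.
    lo = -int(norm_l1 / alpha) - 1
    return sum(1 for n in range(lo, len(f)) if maximal_function(f, n) > alpha)
```

For instance, `level_set_size(1)` counts the integers where some forward average of `f` exceeds 1; the count stays well inside the bound from (26), which is not sharp.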


5.3 Pointwise ergodic theorem 

The last of the series of limit theorems we will study is the pointwise
(or individual) ergodic theorem, which combines ideas of the first two
theorems. At this stage it will be convenient to assume that the measure
space $(X, \mu)$ is finite; we can then normalize the measure and suppose
$\mu(X) = 1$.


Theorem 5.4 Suppose $f$ is integrable over $X$. Then for almost every
$x \in X$ the averages $A_m(f) = \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x))$ converge to a limit as
$m \to \infty$.


Corollary 5.5 If we denote this limit by $P'(f)$, we have that

    $\int_X |P'(f)(x)|\,d\mu(x) \le \int_X |f(x)|\,d\mu(x)$.

Moreover $P'(f) = P(f)$ whenever $f \in L^2(X, \mu)$.

The idea of the proof is as follows. We first show that $A_m(f)$ converges
to a limit almost everywhere for a set of functions $f$ that is dense in
$L^1(X, \mu)$. We then use the maximal theorem to show that this implies
the conclusion for all integrable functions.

We remark to begin with that because the total measure of $X$ is 1, we
have $L^2(X, \mu) \subset L^1(X, \mu)$ and $\|f\|_{L^1} \le \|f\|_{L^2}$, and moreover $L^2(X, \mu)$ is
dense in $L^1(X, \mu)$. In fact, if $f$ belongs to $L^1$, consider the sequence
$\{f_n\}$ defined by $f_n(x) = f(x)$ if $|f(x)| \le n$, $f_n(x) = 0$ otherwise. Then
each $f_n$ is clearly in $L^2$, while by the dominated convergence theorem
$\|f - f_n\|_{L^1} \to 0$.

Now starting with an integrable $f$ and any $\epsilon > 0$ we shall see that we
can write $f = F + H$, where $\|H\|_{L^1} < \epsilon$, and $F = F_0 + (1 - T)G$, where
both $F_0$ and $G$ belong to $L^2$, and $T(F_0) = F_0$, with $T(F_0) = F_0(\tau(x))$. To
obtain this decomposition of $f$, we first write $f = f' + h'$, where $f' \in L^2$
and $\|h'\|_{L^1} < \epsilon/2$, which we can do in view of the density of $L^2$ in $L^1$
as seen above. Next, since the subspaces $S$ and $\overline{S_1}$ of Lemma 5.2 are
orthogonal complements in $L^2$, we can find $F_0 \in S$, $F_1 \in S_1$, such that
$f' = F_0 + F_1 + h$ with $\|h\|_{L^2} < \epsilon/2$. Because $F_1 \in S_1$ is automatically
of the form $F_1 = (1 - T)G$, we obtain $f = F + H$, with $F = F_0 + (1 - T)G$
and $H = h + h'$. Thus $\|H\|_{L^1} \le \|h\|_{L^1} + \|h'\|_{L^1}$, and since $\|h\|_{L^1} \le
\|h\|_{L^2} < \epsilon/2$ we have achieved our desired decomposition of $f$.

Now $A_m(F) = A_m(F_0) + A_m((1 - T)G) = F_0 + \frac{1}{m}(1 - T^m)(G)$, as we
have already seen in the proof of Theorem 5.1. Note that $\frac{1}{m} T^m(G) =
\frac{1}{m} G(\tau^m(x))$ converges to zero as $m \to \infty$ for almost every $x \in X$.
Indeed, the series $\sum_{m=1}^{\infty} \frac{1}{m^2} (G(\tau^m(x)))^2$ converges almost everywhere by
the monotone convergence theorem, since its integral over $X$ is

    $\sum_{m=1}^{\infty} \frac{1}{m^2} \|T^m G\|_{L^2}^2 = \|G\|_{L^2}^2 \sum_{m=1}^{\infty} \frac{1}{m^2}$,

which is finite.

As a result, $A_m(F)(x)$ converges for almost every $x \in X$. Finally,
to prove the corresponding convergence for $A_m(f)(x)$, we argue as in
Theorem 1.3 in Chapter 3 and set

    $E_\alpha = \{x : \lim_{N \to \infty} \sup_{n, m \ge N} |A_n(f)(x) - A_m(f)(x)| > \alpha\}$.

Then it suffices to see that $\mu(E_\alpha) = 0$ for all $\alpha > 0$. However, since
$A_n(f) - A_m(f) = A_n(F) - A_m(F) + A_n(H) - A_m(H)$, and $A_m(F)(x)$
converges almost everywhere as $m \to \infty$, it follows that almost every point
in the set $E_\alpha$ is contained in $E_\alpha'$, where

    $E_\alpha' = \{x : \lim_{N \to \infty} \sup_{n, m \ge N} |A_n(H)(x) - A_m(H)(x)| > \alpha\}$,

and thus $\mu(E_\alpha) \le \mu(E_\alpha') \le \mu(\{x : 2 \sup_m |A_m(H)(x)| > \alpha\})$. The last
quantity is majorized by $\frac{A}{\alpha/2} \|H\|_{L^1} \le 2\epsilon A/\alpha$ by Theorem 5.3. Since
$\epsilon$ was arbitrary we see that $\mu(E_\alpha) = 0$, and hence $A_m(f)(x)$ is a Cauchy
sequence for almost every $x$, and the theorem is proved.

To establish the corollary, observe that if $f \in L^2(X)$, we know by
Theorem 5.1 that $A_m(f)$ converges to $P(f)$ in the $L^2$-norm, and hence
a subsequence converges almost everywhere to that limit, showing that
$P(f) = P'(f)$ in that case. Next, for any $f$ that is merely integrable, we
have

    $\int_X |A_m(f)(x)|\,d\mu(x) \le \frac{1}{m} \sum_{k=0}^{m-1} \int_X |f(\tau^k(x))|\,d\mu(x) = \int_X |f(x)|\,d\mu(x)$,

and thus since $A_m(f) \to P'(f)$ almost everywhere, we get by Fatou’s
lemma that $\int_X |P'(f)(x)|\,d\mu(x) \le \int_X |f(x)|\,d\mu(x)$. With this the
corollary is also proved.

It can be shown that the conclusions of the theorem and corollary are 
still valid if we drop the assumption that the space X has finite measure. 
The modifications of the argument needed to obtain this more general 
conclusion are outlined in Exercise 26. 

5.4 Ergodic measure-preserving transformations 

The adjective “ergodic” is commonly applied to the three limit theorems 
proved above. It also has a related but separate usage describing an 
important class of transformations of the space X. 

We say that a measure-preserving transformation $\tau$ of $X$ is ergodic
if whenever $E$ is a measurable set that is “invariant,” that is, $E$ and
$\tau^{-1}(E)$ differ by sets of measure zero, then either $E$ or $E^c$ has measure
zero.

There is a useful rephrasing of this condition of ergodicity. Expanding
the definition used in Section 5.1 we say that a measurable function $f$
is invariant if $f(x) = f(\tau(x))$ for a.e. $x \in X$. Then $\tau$ is ergodic exactly
when the only invariant functions are equivalent to constants. In fact,
let $\tau$ be an ergodic transformation, and assume that $f$ is a real-valued
invariant function. Then each of the sets $E_\alpha = \{x : f(x) > \alpha\}$ is invariant,
hence $\mu(E_\alpha) = 0$ or $\mu(E_\alpha^c) = 0$ for each $\alpha$. However, if $f$ is not equivalent
to a constant, then both $\mu(E_\alpha)$ and $\mu(E_\alpha^c)$ must be strictly positive
for some $\alpha$. In the converse direction we merely need to note
that if all characteristic functions of measurable sets that are invariant
must be constants, then $\tau$ is ergodic.

The following result subsumes the conclusion of Theorem 5.4 for
ergodic transformations. We keep to the assumption of that theorem that
the underlying space $X$ has measure equal to 1.

Corollary 5.6 Suppose $\tau$ is an ergodic measure-preserving
transformation. For any integrable function $f$ we have

    $\frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x))$  converges to  $\int_X f\,d\mu$  for a.e. $x \in X$.

The result has the interpretation that the “time average” of $f$ equals its
“space average.”

Proof. By Theorem 5.1 we know that the averages $A_m(f)$ converge
to $P(f)$, whenever $f \in L^2$, where $P$ is the orthogonal projection on the
subspace of invariant vectors. Since in this case the invariant vectors
form a one-dimensional space spanned by the constant functions, we
observe that $P(f) = 1(f, 1) = \int_X f\,d\mu$, where 1 designates the function
identically equal to 1 on $X$. To verify this, note that $P$ is the identity on
constants and annihilates all functions orthogonal to constants. Next we
write any $f \in L^1$ as $g + h$, where $g \in L^2$ and $\|h\|_{L^1} < \epsilon$. Then $P'(f) =
P'(g) + P'(h)$. However, we also know that $P'(g) = P(g)$, and $\|P'(h)\|_{L^1} \le
\|h\|_{L^1} < \epsilon$ by the corollary to Theorem 5.4. Thus

    $P'(f) - \int_X f\,d\mu = \int_X (g - f)\,d\mu + P'(h)$

yields that $\|P'(f) - \int_X f\,d\mu\|_{L^1} \le \|g - f\|_{L^1} + \epsilon < 2\epsilon$. This shows that
$P'(f)$ is the constant $\int_X f\,d\mu$ and the assertion is proved.

We shall now elaborate on the nature of ergodicity and illustrate its 
thrust in terms of several examples. 

a) Rotations of the circle 

Here we take up the example described in (iii) at the beginning of
Section 5*. On the unit circle $\mathbb{R}/\mathbb{Z}$ with the induced Lebesgue measure,
we consider the action $\tau$ given by $x \mapsto x + \alpha \mod 1$. The result is

• The mapping $\tau$ is ergodic if and only if $\alpha$ is irrational.

To begin with, if $\alpha$ is irrational we know by the equidistribution theorem
that

(29) $$\frac{1}{n} \sum_{k=0}^{n-1} f(x + k\alpha) \to \int_0^1 f(x) \, dx \quad \text{as } n \to \infty$$

for every $x$ if $f$ is continuous on $[0,1]$ and periodic ($f(0) = f(1)$). The
argument used to prove this goes as follows.$^2$ First we verify that (29)
holds whenever $f(x) = e^{2\pi i n x}$, $n \in \mathbb{Z}$, by considering the cases $n = 0$ and
$n \ne 0$ separately. It then follows that (29) is valid for any trigonometric
polynomial (a finite linear combination of these exponentials). Finally,
any continuous and periodic function can be uniformly approximated by
trigonometric polynomials, so (29) goes over to the general case.
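
As a numerical illustration of (29), the following sketch (not part of the original argument) compares the averages with the integral for one sample function. The function $f(x) = \cos^2(2\pi x)$, the starting points, and the values $\alpha = \sqrt{2}$ and $\alpha = 1/2$ are assumptions chosen only for the example; the helper name `birkhoff_average` is hypothetical.

```python
import math

def birkhoff_average(f, x, alpha, n):
    """Return (1/n) * sum_{k=0}^{n-1} f((x + k*alpha) mod 1)."""
    total = 0.0
    for k in range(n):
        total += f((x + k * alpha) % 1.0)
    return total / n

def f(x):
    # Continuous and periodic on [0, 1]; its integral over [0, 1] equals 1/2.
    return math.cos(2 * math.pi * x) ** 2

# Irrational rotation: the average should be close to the integral 1/2.
irrational = birkhoff_average(f, 0.3, math.sqrt(2), 20000)

# Rational rotation alpha = 1/2 started at x = 0: the orbit is {0, 1/2},
# where f equals 1, so the average stays at 1 and does not approach 1/2.
rational = birkhoff_average(f, 0.0, 0.5, 20000)
```

In this sketch the irrational case lands near the space average, while the rational case does not, which matches the dichotomy stated in the bullet above.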

Now if $P$ is the projection on invariant $L^2$-functions, then Theorem 5.1
and (29) show that $P$ projects onto the constants, when restricted to the
continuous periodic functions. Since this subspace is dense in $L^2$, we
see that $P$ still projects all of $L^2$ on constants; hence the invariant
$L^2$-functions are constants and thus $\tau$ is ergodic.

On the other hand, suppose $\alpha = p/q$. Choose any set $E_0 \subset (0, 1/q)$, so
that $0 < m(E_0) < 1/q$, and let $E$ denote the disjoint union
$\bigcup_{r=0}^{q-1} (E_0 + r/q)$. Then clearly $E$ is invariant under $\tau : x \mapsto x + p/q$, and
$0 < m(E) = q\,m(E_0) < 1$; thus $\tau$ is not ergodic.

The property (29) we used, which involves the existence of the limit
at all points, is actually stronger than ergodicity: it implies that the
measure $d\mu = dx$ is uniquely ergodic for this mapping $\tau$. That means
that if $\nu$ is any measure on the Borel sets of $X$ preserved by $\tau$ and
$\nu(X) = 1$, then $\nu$ must equal $\mu$.

To see that this is so in the present case, let $P_\nu$ be the orthogonal
projection guaranteed by Theorem 5.1, on the space $L^2(X, \nu)$. Then (29) shows
again that the range of $P_\nu$ on the continuous functions, and then on all
of $L^2(X, \nu)$, is the subspace of constants, and thus $P_\nu(f) = \int_0^1 f \, d\nu$.

This means also that $\int_0^1 f(x) \, dx = \int_0^1 f \, d\nu$ whenever $f$ is continuous
and periodic. By a simple limiting argument we then get that the measures
$dx = d\mu$ and $\nu$ agree on all open intervals, and thus on all open
sets. As we have seen, this proves that the two measures are
identical.

In general, uniquely ergodic measure-preserving transformations are 
ergodic, but the converse need not be true, as we shall see below. 

b) The doubling mapping 


2 See also Section 2, Chapter 4 in Book I. 





We now consider the mapping $\tau : x \mapsto 2x \mod 1$ for $x \in (0,1]$, with $\mu$
Lebesgue measure, that arose in example (iv) at the beginning of
Section 5*. We shall prove that $\tau$ is ergodic and in fact satisfies a different
and stronger property called mixing.$^3$ It is defined as follows.

If $\tau$ is a measure-preserving transformation on the space $(X, \mu)$, it is
said to be mixing if whenever $E$ and $F$ are a pair of measurable subsets
then

(30) $$\mu(\tau^{-n}(E) \cap F) \to \mu(E)\mu(F) \quad \text{as } n \to \infty.$$

The meaning of (30) can be understood as follows. In probability theory
one often encounters a “universe” of possible events to which probabilities
are assigned. These events are represented as measurable subsets $E$ of
some space $(X, \mu)$ with $\mu(X) = 1$. The probability of each event is then
$\mu(E)$. Two events $E$ and $F$ are “independent” if the probability that
they both occur is the product of their separate probabilities, that is,
$\mu(E \cap F) = \mu(E)\mu(F)$. The assertion (30) of mixing is then that in the
limit as time $n$ tends to infinity, the sets $\tau^{-n}(E)$ and $F$ are asymptotically
independent, whatever the choices of $E$ and $F$.

We shall next observe that the mixing condition is implied by the
seemingly stronger condition

(31) $$(T^n f, g) \to (f, 1)(1, g) \quad \text{as } n \to \infty,$$

whenever $f$ and $g$ belong to $L^2(X, \mu)$, where $T^n(f)(x) = f(\tau^n(x))$. This
implication follows immediately upon taking $f = \chi_E$ and $g = \chi_F$. The
converse is also true, but we leave its proof as an exercise to the reader.

We now remark that the mixing condition implies the ergodicity of $\tau$.
Indeed, by (31)

$$(A_n(f), g) = \frac{1}{n} \sum_{k=0}^{n-1} (T^k f, g) \quad \text{converges to } (f, 1)(1, g).$$

This means $(P(f), g) = (f, 1)(1, g)$, and hence $P(f)$ is orthogonal to all
$g$ that are orthogonal to constants. This of course means that $P$ is the
orthogonal projection on constants, and hence $\tau$ is ergodic.

We next observe that the doubling map is mixing. Indeed, if $f(x) =
e^{2\pi i m x}$, $g(x) = e^{2\pi i k x}$, then $(f, 1)(1, g) = 0$, unless both $m$ and $k$ are 0,
in which case this product equals 1. However, in this case $(T^n f, g) =
\int_0^1 e^{2\pi i m 2^n x} e^{-2\pi i k x} \, dx$, and this vanishes for sufficiently large $n$, unless
both $m$ and $k$ are 0, in which case the integral equals 1. Thus (31)
holds for all exponentials $f(x) = e^{2\pi i m x}$, $g(x) = e^{2\pi i k x}$, and therefore by
linearity for all trigonometric polynomials $f$ and $g$. It is from there an
easy step to use the completeness in Chapter 4 to pass to all $f$ and $g$ in
$L^2((0,1])$ by approximating these functions in the $L^2$-norm by
trigonometric polynomials.

3 This property is often referred to as “strongly mixing” to distinguish it from still
another kind of ergodicity called “weakly mixing.”
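
The mixing property (30) for the doubling map can also be observed numerically. The sketch below is only an illustration under stated assumptions: the intervals $E = [0, 1/3)$ and $F = [0.2, 0.7)$, the grid size, and the helper name `mixing_measure` are all choices made for the example, and the measure is approximated by counting grid midpoints rather than computed exactly.

```python
def mixing_measure(n, E, F, N=2**16):
    """Approximate mu(tau^{-n}(E) intersect F) for tau(x) = 2x mod 1.

    E and F are half-open intervals (lo, hi) inside [0, 1); the measure is
    estimated by counting midpoints of a uniform grid with N cells.
    """
    count = 0
    for j in range(N):
        x = (j + 0.5) / N            # dyadic rational, exact in floating point
        y = (2**n * x) % 1.0         # tau^n(x), also exact for these x and small n
        if E[0] <= y < E[1] and F[0] <= x < F[1]:
            count += 1
    return count / N

E = (0.0, 1.0 / 3.0)
F = (0.2, 0.7)
values = [mixing_measure(n, E, F) for n in range(0, 7)]
product = (E[1] - E[0]) * (F[1] - F[0])   # mu(E) * mu(F) = 1/6
```

For $n = 0$ the value is simply an approximation of $\mu(E \cap F)$, which differs from $\mu(E)\mu(F)$; as $n$ grows the entries of `values` should approach `product`, up to the grid error.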

Let us observe that the action of rotations $\tau : x \mapsto x + \alpha$ of the unit
circle for irrational $\alpha$, although ergodic, is not mixing. Indeed, if we take
$f(x) = g(x) = e^{2\pi i m x}$, $m \ne 0$, then $(T^n f, g) = e^{2\pi i n m \alpha} (f, g) = e^{2\pi i n m \alpha}$,
while $(f, 1) = (1, g) = 0$; thus $(T^n f, g)$ does not converge to $(f, 1)(1, g)$
as $n \to \infty$.

Finally, we note that the doubling map $\tau : x \mapsto 2x \mod 1$ on $(0,1]$
is not uniquely ergodic. Besides the Lebesgue measure, the measure $\nu$
with $\nu\{1\} = 1$ but $\nu(E) = 0$ if $1 \notin E$ is also preserved by $\tau$.

Further examples of ergodic transformations are given below. 

6* Appendix: the spectral theorem 

The purpose of this appendix is to present an outline of the proof of the spectral 
theorem for bounded symmetric operators on a Hilbert space. Details that are 
not central to the proof of the theorem will be left to the interested reader to fill 
in. The theorem provides an interesting application of the ideas related to the 
Lebesgue-Stieltjes integrals that are treated in this chapter. 

6.1 Statement of the theorem 

A basic notion is that of a spectral resolution (or spectral family) on a Hilbert
space $\mathcal{H}$. This is a function $\lambda \mapsto E(\lambda)$ from $\mathbb{R}$ to orthogonal projections on $\mathcal{H}$ that
satisfies the following:

(i) $E(\lambda)$ is increasing in the sense that $\|E(\lambda)f\|$ is an increasing function of $\lambda$
for every $f \in \mathcal{H}$.

(ii) There is an interval $[a, b]$ such that $E(\lambda) = 0$ if $\lambda < a$, and $E(\lambda) = I$ if $\lambda \ge b$.
Here $I$ denotes the identity operator on $\mathcal{H}$.

(iii) $E(\lambda)$ is right-continuous, that is, for every $\lambda$ one has

$$\lim_{\mu \to \lambda,\ \mu > \lambda} E(\mu)f = E(\lambda)f \quad \text{for every } f \in \mathcal{H}.$$

Observe that property (i) is equivalent with each of the following three assertions
(holding for all pairs $\lambda$, $\mu$ with $\mu > \lambda$): (a) the range of $E(\mu)$ contains the range of
$E(\lambda)$; (b) $E(\mu)E(\lambda) = E(\lambda)$; (c) $E(\mu) - E(\lambda)$ is an orthogonal projection.

Now given a spectral resolution $\{E(\lambda)\}$ and an element $f \in \mathcal{H}$, note that the
function $\lambda \mapsto (E(\lambda)f, f) = \|E(\lambda)f\|^2$ is also increasing. As a result, the
polarization identity (see Section 5 in Chapter 4) shows that for every pair $f, g \in \mathcal{H}$,
the function $F(\lambda) = (E(\lambda)f, g)$ is of bounded variation, and is moreover
right-continuous. With these two observations we can now state the main result.


Theorem 6.1 Suppose $T$ is a bounded symmetric operator on a Hilbert space $\mathcal{H}$.
Then there exists a spectral resolution $\{E(\lambda)\}$ such that

$$T = \int_{a^-}^{b} \lambda \, dE(\lambda)$$

in the sense that for every $f, g \in \mathcal{H}$,

(32) $$(Tf, g) = \int_{a^-}^{b} \lambda \, d(E(\lambda)f, g) = \int_{a^-}^{b} \lambda \, dF(\lambda).$$

The integral on the right-hand side is taken in the Lebesgue-Stieltjes sense, as
in (iii) and (iv) of Section 3.3.

The result encompasses the spectral theorem for compact symmetric operators $T$
in the following sense. Let $\{\varphi_k\}$ be an orthonormal basis of eigenvectors of $T$ with
corresponding eigenvalues $\lambda_k$, as guaranteed by Theorem 6.2 in Chapter 4. In this
case, we take the spectral resolution to be defined via this orthogonal expansion
by

$$E(\lambda)f \sim \sum_{\lambda_k \le \lambda} (f, \varphi_k)\varphi_k,$$

and one easily verifies that it satisfies conditions (i), (ii) and (iii) above. We also
note that $\|E(\lambda)f\|^2 = \sum_{\lambda_k \le \lambda} |(f, \varphi_k)|^2$, and thus $F(\lambda) = (E(\lambda)f, g)$ is a pure jump
function as in Section 3.3 in Chapter 3.
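
To make the pure-jump case concrete, here is a small finite-dimensional sketch. It is an illustration only: the matrix $T$, its hard-coded eigenpairs, the vectors $f$ and $g$, and the helper names are assumptions chosen for the example. The Stieltjes integral in (32) reduces to a finite sum of $\lambda_k$ times the jump of $F$ at $\lambda_k$.

```python
import math

# Symmetric matrix T = [[2, 1], [1, 2]] on R^2 with its known eigen-decomposition.
T = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigenpairs = [(1.0, (s, -s)), (3.0, (s, s))]   # pairs (lambda_k, phi_k)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply(M, v):
    return tuple(dot(row, v) for row in M)

def E(lam, f):
    """E(lambda) f = sum over lambda_k <= lambda of (f, phi_k) phi_k."""
    out = [0.0, 0.0]
    for lam_k, phi in eigenpairs:
        if lam_k <= lam:
            c = dot(f, phi)
            out = [o + c * p for o, p in zip(out, phi)]
    return tuple(out)

def F(lam, f, g):
    """The jump function F(lambda) = (E(lambda) f, g)."""
    return dot(E(lam, f), g)

def stieltjes_integral(f, g):
    """Integral of lambda dF(lambda): sum of lambda_k times the jump of F at lambda_k."""
    total = 0.0
    for lam_k, _ in eigenpairs:
        jump = F(lam_k, f, g) - F(lam_k - 1e-9, f, g)
        total += lam_k * jump
    return total

f, g = (1.0, 2.0), (3.0, -1.0)
lhs = dot(apply(T, f), g)        # (Tf, g)
rhs = stieltjes_integral(f, g)   # integral of lambda dF(lambda)
```

Here `lhs` and `rhs` agree up to rounding, and $E(\lambda)$ is zero below the smallest eigenvalue and the identity from the largest eigenvalue on, as conditions (i) to (iii) require.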

6.2 Positive operators 

The proof of the theorem depends on the concept of positivity of operators. We
say that $T$ is positive, written as $T \ge 0$, if $T$ is symmetric and $(Tf, f) \ge 0$ for
all $f \in \mathcal{H}$. (Note that $(Tf, f)$ is automatically real if $T$ is symmetric.) One then
writes $T_1 \ge T_2$ to mean that $T_1 - T_2 \ge 0$. Note that for two orthogonal projections
we have $E_2 \ge E_1$ if and only if $\|E_2 f\| \ge \|E_1 f\|$ for all $f \in \mathcal{H}$, and that is then
equivalent with the corresponding properties (a)-(c) described above. Notice also
that if $S$ is symmetric, then $S^2 = T$ is positive. Now for $T$ symmetric, let us write

(33) $$a = \min (Tf, f) \quad \text{and} \quad b = \max (Tf, f) \quad \text{for } \|f\| \le 1.$$

Proposition 6.2 Suppose $T$ is symmetric. Then $\|T\| \le M$ if and only if $-MI \le
T \le MI$. As a result, $\|T\| = \max(|a|, |b|)$.

This is a consequence of (7) in Chapter 4.

Proposition 6.3 Suppose $T$ is positive. Then there exists a symmetric operator
$S$ (which can be written as $T^{1/2}$) such that $S^2 = T$ and $S$ commutes with every
operator that commutes with $T$.

The last assertion means that if for some operator $A$ we have $AT = TA$, then
$AS = SA$.

The existence of $S$ is seen as follows. After multiplying by a suitable positive
scalar, we may assume that $\|T\| \le 1$. Consider the binomial expansion of
$(1 - t)^{1/2}$, given by $(1 - t)^{1/2} = \sum_{k=0}^{\infty} b_k t^k$, for $|t| < 1$. The relevant fact that is needed
here is that the $b_k$ are real and $\sum_{k=0}^{\infty} |b_k| < \infty$. Indeed, by direct calculation of
the power series expansion of $(1 - t)^{1/2}$ we find that $b_0 = 1$, $b_1 = -1/2$, $b_2 = -1/8$,
and more generally, $b_k = -1/2 \cdot 1/2 \cdots (k - 3/2)/k!$, if $k \ge 2$, from which it follows
that $b_k = O(k^{-3/2})$. Or more simply, since $b_k < 0$ when $k \ge 1$, if we let $t \to 1$ in
the definition, we see that $-\sum_{k=1}^{\infty} b_k = 1$ and so $\sum_{k=0}^{\infty} |b_k| = 2$.

Now let $s_n(t)$ denote the polynomial $\sum_{k=0}^{n} b_k t^k$. Then the polynomial

(34) $$s_n^2(t) - (1 - t) = \sum_{k=0}^{2n} c_k^n t^k$$

has the property that $\sum_{k=0}^{2n} |c_k^n| \to 0$ as $n \to \infty$. In fact, $s_n(t) = (1 - t)^{1/2} - r_n(t)$,
with $r_n(t) = \sum_{k=n+1}^{\infty} b_k t^k$, so $s_n^2(t) - (1 - t) = -r_n^2(t) - 2 s_n(t) r_n(t)$. Now the
left-hand side is clearly a polynomial of degree $\le 2n$, and so comparing coefficients with
those on the right-hand side shows that the $c_k^n$ are majorized by $3 \sum_{j > n} |b_j| \, |b_{k-j}|$.
From this it is immediate that $\sum_k |c_k^n| = O(\sum_{j > n} |b_j|) \to 0$ as $n \to \infty$, as asserted.

To apply this, set $T_1 = I - T$; then $0 \le T_1 \le I$, and thus $\|T_1\| \le 1$, by
Proposition 6.2. Let $S_n = s_n(T_1) = \sum_{k=0}^{n} b_k T_1^k$, with $T_1^0 = I$. Then in terms of operator
norms, $\|S_n - S_m\| \le \sum_{k \ge \min(n,m)} |b_k| \to 0$ as $n, m \to \infty$, because $\|T_1^k\| \le \|T_1\|^k \le
1$. Hence $S_n$ converges to some operator $S$. Clearly $S_n$ is symmetric for each $n$,
and thus $S$ is also symmetric. Moreover, by (34), $S_n^2 - T = \sum_{k=0}^{2n} c_k^n T_1^k$, therefore
$\|S_n^2 - T\| \le \sum |c_k^n| \to 0$ as $n \to \infty$, which implies that $S^2 = T$. Finally, if $A$
commutes with $T$ it clearly commutes with every polynomial in $T$, hence with $S_n$, and
thus with $S$. The proof of the proposition is therefore complete.
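
The construction of the square root can be tried out on a small matrix. The following sketch is an illustration under assumptions: the $2 \times 2$ matrix $T$, the truncation level 80, and the helper names are choices made for the example. It forms $S_n = \sum_{k=0}^{n} b_k (I - T)^k$ using the recurrence $b_k = b_{k-1}(k - 3/2)/k$, which follows from the formula for $b_k$ above, and compares $S_n^2$ with $T$.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B, scale=1.0):
    """Return A + scale * B."""
    return [[a + scale * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sqrt_by_series(T, n_terms):
    """Return S_n = sum_{k=0}^{n} b_k (I - T)^k, where (1 - t)^{1/2} = sum b_k t^k."""
    n = len(T)
    I = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    T1 = mat_add(I, T, scale=-1.0)     # T_1 = I - T
    power = I                          # T_1^0 = I
    b = 1.0                            # b_0 = 1
    S = [[0.0] * n for _ in range(n)]
    for k in range(n_terms + 1):
        if k > 0:
            b *= (k - 1.5) / k         # b_k = b_{k-1} (k - 3/2) / k
            power = mat_mul(power, T1)
        S = mat_add(S, power, scale=b)
    return S

T = [[0.5, 0.2], [0.2, 0.4]]           # symmetric, positive, norm below 1
S = sqrt_by_series(T, 80)
S_squared = mat_mul(S, S)
error = max(abs(S_squared[i][j] - T[i][j]) for i in range(2) for j in range(2))
```

For this particular $T$ the norm of $I - T$ is strictly less than 1, so the partial sums converge geometrically and `error` is tiny; when $\|I - T\| = 1$ the convergence relies only on $\sum |b_k| < \infty$ and is slower.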

Proposition 6.4 If $T_1$ and $T_2$ are positive operators that commute, then $T_1 T_2$ is
also positive.

Indeed, if $S$ is a square root of $T_1$ given in the previous proposition, then $T_1 T_2 =
SST_2 = ST_2 S$, and hence $(T_1 T_2 f, f) = (ST_2 Sf, f) = (T_2 Sf, Sf)$, since $S$ is
symmetric, and thus the last term is positive.

Proposition 6.5 Suppose $T$ is symmetric and $a$ and $b$ are given by (33). If $p(t) =
\sum_{k=0}^{n} c_k t^k$ is a real polynomial which is positive for $t \in [a, b]$, then the operator
$p(T) = \sum_{k=0}^{n} c_k T^k$ is positive.

To see this, write $p(t) = c \prod_j (t - \rho_j) \prod_k (\rho'_k - t) \prod_\ell ((t - \mu_\ell)^2 + \nu_\ell)$, where $c$ is
positive and the third factor corresponds to the non-real roots of $p(t)$ (arising in
conjugate pairs), and the real roots of $p(t)$ lying in $(a, b)$ which are necessarily of
even order. The first factor contains the real roots $\rho_j$ with $\rho_j \le a$, and the second
factor the real roots $\rho'_k$ with $\rho'_k \ge b$. Since each of the factors $T - \rho_j I$, $\rho'_k I - T$
and $(T - \mu_\ell I)^2 + \nu_\ell I$ is positive and these commute, the desired conclusion follows
from the previous proposition.

Corollary 6.6 If $p(t)$ is a real polynomial, then

$$\|p(T)\| \le \sup_{t \in [a,b]} |p(t)|.$$

This is an immediate consequence using Proposition 6.2, since $-M \le p(t) \le M$,
where $M = \sup_{t \in [a,b]} |p(t)|$, and thus $-MI \le p(T) \le MI$.

Proposition 6.7 Suppose $\{T_n\}$ is a sequence of positive operators that satisfy
$T_n \ge T_{n+1}$ for all $n$. Then there is a positive operator $T$, such that $T_n f \to Tf$ as
$n \to \infty$ for every $f \in \mathcal{H}$.

Proof. We note that for each fixed $f \in \mathcal{H}$ the sequence of positive numbers
$(T_n f, f)$ is decreasing and hence convergent. Now observe that for any positive
operator $S$ with $\|S\| \le M$ we have

(35) $$\|S(f)\|^2 \le (Sf, f)^{1/2} M^{3/2} \|f\|.$$

In fact, the quadratic function $(S(tI + S)f, (tI + S)f) = t^2 (Sf, f) + 2t(Sf, Sf) +
(S^2 f, Sf)$ is positive for all real $t$. Hence its discriminant is negative, that is,
$\|S(f)\|^4 \le (Sf, f)(S^2 f, Sf)$, and (35) follows. We apply this to $S = T_n - T_m$ with
$n < m$; then $\|T_n - T_m\| \le \|T_n\| \le \|T_1\| = M$, and since $((T_n - T_m)f, f) \to 0$ as
$n, m \to \infty$ we see that $\|T_n f - T_m f\| \to 0$ as $n, m \to \infty$. Thus $\lim_{n \to \infty} T_n(f) =
T(f)$ exists, and $T$ is also clearly positive.


6.3 Proof of the theorem 

Starting with a given symmetric operator $T$, and with $a$, $b$ given by (33), we shall
now exploit further the idea of associating to each suitable function $\Phi$ on $[a, b]$ a
symmetric operator $\Phi(T)$. We do this in increasing order of generality. First, if
$\Phi$ is a real polynomial $\sum_{k=0}^{n} c_k t^k$, then, as before, $\Phi(T)$ is defined as $\sum_{k=0}^{n} c_k T^k$.

Notice that this association is a homomorphism: if $\Phi = \Phi_1 + \Phi_2$, then $\Phi(T) =
\Phi_1(T) + \Phi_2(T)$; also if $\Phi = \Phi_1 \cdot \Phi_2$, then $\Phi(T) = \Phi_1(T) \cdot \Phi_2(T)$. Moreover, since
$\Phi$ is real (and the $c_k$ are real), $\Phi(T)$ is symmetric.

Next, because every real-valued continuous function $\Phi$ on $[a, b]$ can be
approximated uniformly by polynomials $p_n$ (see, for instance, Section 1.8, Chapter 5 of
Book I), we see by Corollary 6.6 that the sequence $p_n(T)$ converges, in the norm of
operators, to a limit which we call $\Phi(T)$, and moreover this limit does not depend
on the sequence of polynomials approximating $\Phi$. Also, $\Phi(T)$ is automatically a
symmetric operator. If $\Phi(t) \ge 0$ on $[a, b]$ we can always take the approximating
sequence to be positive on $[a, b]$, and as a result $\Phi(T) \ge 0$.

Finally, we define $\Phi(T)$ whenever $\Phi$ arises as a limit, $\Phi(t) = \lim_{n \to \infty} \Phi_n(t)$,
where $\{\Phi_n(t)\}$ is a decreasing sequence of positive continuous functions on $[a, b]$. In
fact, by Proposition 6.7 the limit $\lim_{n \to \infty} \Phi_n(T)$ exists by what we have established
above for $\Phi_n$. To show that this limit is independent of the sequence $\{\Phi_n\}$ and
thus that $\Phi(T)$ is well-defined as the limit above, let $\{\Phi'_n\}$ be another sequence of
decreasing continuous functions converging to $\Phi$. Then whenever $\epsilon > 0$ is given and
$k$ is fixed, $\Phi'_n(t) \le \Phi_k(t) + \epsilon$ for all $n$ sufficiently large. Thus $\Phi'_n(T) \le \Phi_k(T) + \epsilon I$
for these $n$, and passing to the limit first in $n$, then in $k$, and then with $\epsilon \to 0$, we get
$\lim_{n \to \infty} \Phi'_n(T) \le \lim_{k \to \infty} \Phi_k(T)$. By symmetry, the reverse inequality holds, and
the two limits are the same. Note also that for a pair of these limiting functions,
if $\Phi_1(t) \le \Phi_2(t)$ for $t \in [a, b]$, then $\Phi_1(T) \le \Phi_2(T)$.

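The basic functions $\Phi$, $\Phi = \varphi^\lambda$, that give us the spectral resolution are defined
for each real $\lambda$ by

$$\varphi^\lambda(t) = 1 \text{ if } t \le \lambda \quad \text{and} \quad \varphi^\lambda(t) = 0 \text{ if } \lambda < t.$$

We note that $\varphi^\lambda(t) = \lim \varphi_n^\lambda(t)$, where $\varphi_n^\lambda(t) = 1$ if $t \le \lambda$, $\varphi_n^\lambda(t) = 0$ if $t \ge \lambda + 1/n$,
and $\varphi_n^\lambda(t)$ is linear for $t \in [\lambda, \lambda + 1/n]$. Thus each $\varphi^\lambda(t)$ is a limit of a decreasing
sequence of continuous functions. In accordance with the above we set

$$E(\lambda) = \varphi^\lambda(T).$$

Since $\lim_{n \to \infty} \varphi_n^{\lambda_1}(t) \varphi_n^{\lambda_2}(t) = \varphi^{\lambda_1}(t)$ whenever $\lambda_1 \le \lambda_2$, we see that $E(\lambda_1)E(\lambda_2) =
E(\lambda_1)$. Thus $E(\lambda)^2 = E(\lambda)$ for every $\lambda$, and because $E(\lambda)$ is symmetric it is
therefore an orthogonal projection. Moreover, for every $f \in \mathcal{H}$

$$\|E(\lambda_1)f\| = \|E(\lambda_1)E(\lambda_2)f\| \le \|E(\lambda_2)f\|,$$

thus $E(\lambda)$ is increasing. Clearly $E(\lambda) = 0$ if $\lambda < a$, since for those $\lambda$, $\varphi^\lambda(t) = 0$ on
$[a, b]$. Similarly, $E(\lambda) = I$ for $\lambda \ge b$.

Next we note that $E(\lambda)$ is right-continuous. In fact, fix $f \in \mathcal{H}$ and $\epsilon > 0$. Then
for some $n$, which we now keep fixed, $\|E(\lambda)f - \varphi_n^\lambda(T)f\| < \epsilon$. However, $\varphi_n^\mu(t)$
converges to $\varphi_n^\lambda(t)$ uniformly in $t$ as $\mu \to \lambda$. Hence $\sup_t |\varphi_n^\mu(t) - \varphi_n^\lambda(t)| < \epsilon$, if
$|\mu - \lambda| < \delta$, for an appropriate $\delta$. Thus by the corollary $\|\varphi_n^\mu(T) - \varphi_n^\lambda(T)\| < \epsilon$
and therefore $\|E(\lambda)f - \varphi_n^\mu(T)f\| < 2\epsilon$. Now with $\mu \ge \lambda$ we have that $E(\mu)E(\lambda) =
E(\lambda)$ and $E(\mu)\varphi_n^\mu(T) = E(\mu)$. As a result $\|E(\lambda)f - E(\mu)f\| < 2\epsilon$, if $\lambda \le \mu \le \lambda + \delta$.
Since $\epsilon$ was arbitrary, the right continuity is established.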

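Finally we verify the spectral representation (32). Let $a = \lambda_0 < \lambda_1 < \cdots < \lambda_k =
b$ be any partition of $[a, b]$ for which $\sup_j (\lambda_j - \lambda_{j-1}) < \delta$. Then since

$$1 = \sum_{j=1}^{k} \left( \varphi^{\lambda_j}(t) - \varphi^{\lambda_{j-1}}(t) \right) + \varphi^{\lambda_0}(t)$$

we note that

$$t \le \sum_{j=1}^{k} \lambda_j \left( \varphi^{\lambda_j}(t) - \varphi^{\lambda_{j-1}}(t) \right) + \lambda_0 \varphi^{\lambda_0}(t) \le t + \delta.$$

Applying these functions to the operator $T$ we obtain

$$T \le \sum_{j=1}^{k} \lambda_j \left( E(\lambda_j) - E(\lambda_{j-1}) \right) + \lambda_0 E(\lambda_0) \le T + \delta I,$$

and thus $T$ differs in norm from the sum above by at most $\delta$. As a result

$$\left| (Tf, f) - \sum_{j=1}^{k} \lambda_j \int_{(\lambda_{j-1}, \lambda_j]} d(E(\lambda)f, f) - \lambda_0 (E(\lambda_0)f, f) \right| \le \delta \|f\|^2.$$

But as we vary the partitions of $[a, b]$, letting their meshes $\delta$ tend to zero, the
above sum tends to $\int_{a^-}^{b} \lambda \, d(E(\lambda)f, f)$. Therefore $(Tf, f) = \int_{a^-}^{b} \lambda \, d(E(\lambda)f, f)$, and
the polarization identity gives (32).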

A similar argument shows that if $\Phi$ is continuous on $[a, b]$, then the operator
$\Phi(T)$ has an analogous spectral representation

(36) $$(\Phi(T)f, g) = \int_{a^-}^{b} \Phi(\lambda) \, d(E(\lambda)f, g).$$

This is because $|\Phi(t) - \sum_{j=1}^{k} \Phi(\lambda_j)(\varphi^{\lambda_j}(t) - \varphi^{\lambda_{j-1}}(t)) - \Phi(\lambda_0)\varphi^{\lambda_0}(t)| < \delta'$, where
$\delta' = \sup_{|t - t'| \le \delta} |\Phi(t) - \Phi(t')|$, which tends to zero as $\delta \to 0$.

This representation also extends to continuous $\Phi$ that are complex-valued (by
considering the real and imaginary parts separately) or to $\Phi$ that are pointwise
limits of decreasing sequences of continuous functions.


6.4 Spectrum 

We say that a bounded operator $S$ on $\mathcal{H}$ is invertible if $S$ is a bijection of $\mathcal{H}$
and its inverse, $S^{-1}$, is also bounded. Note that $S^{-1}$ satisfies $S^{-1}S = SS^{-1} = I$.
The spectrum of $S$, denoted by $\sigma(S)$, is the set of complex numbers $z$ for which
$S - zI$ is not invertible.

Proposition 6.8 If $T$ is symmetric, then $\sigma(T)$ is a closed subset of the interval
$[a, b]$ given by (33).

Note that if $z \notin [a, b]$, the function $\Phi(t) = (t - z)^{-1}$ is continuous on $[a, b]$ and
$\Phi(T)(T - zI) = (T - zI)\Phi(T) = I$, so $\Phi(T)$ is the inverse of $T - zI$. Now suppose
$T_0 = T - \lambda_0 I$ is invertible. Then we claim that $T_0 - \epsilon I$ is invertible for all
(complex) $\epsilon$ that are sufficiently small. This will prove that the complement of $\sigma(T)$ is
open. Indeed, $T_0 - \epsilon I = T_0 (I - \epsilon T_0^{-1})$, and we can invert $T_0 - \epsilon I$ (formally)
by expanding $(I - \epsilon T_0^{-1})^{-1}$ in a geometric series, which gives the inverse as the sum

$$\sum_{n=0}^{\infty} \epsilon^n (T_0^{-1})^{n+1}.$$

Since $\sum_{n=0}^{\infty} \|\epsilon^n (T_0^{-1})^{n+1}\| \le \sum_{n=0}^{\infty} |\epsilon|^n \|T_0^{-1}\|^{n+1}$, the series converges when
$|\epsilon| < \|T_0^{-1}\|^{-1}$ and the sum is majorized by

(37) $$\frac{\|T_0^{-1}\|}{1 - |\epsilon| \, \|T_0^{-1}\|}.$$

Thus we can define the operator $(T_0 - \epsilon I)^{-1}$ as $\lim_{N \to \infty} \sum_{n=0}^{N} \epsilon^n (T_0^{-1})^{n+1}$,
and it gives the desired inverse, as is easily verified.

Our last assertion connects the spectrum $\sigma(T)$ with the spectral resolution
$\{E(\lambda)\}$.

Proposition 6.9 For each $f \in \mathcal{H}$, the Lebesgue-Stieltjes measure corresponding
to $F(\lambda) = (E(\lambda)f, f)$ is supported on $\sigma(T)$.

To put it another way, $F(\lambda)$ is constant on each open interval of the complement
of $\sigma(T)$.

To prove this, let $J$ be one of the open intervals in the complement of $\sigma(T)$,
$x_0 \in J$, and $J_\epsilon$ the sub-interval centered at $x_0$ of length $2\epsilon$, with $\epsilon < \|(T - x_0 I)^{-1}\|^{-1}$.
First note that if $z$ has non-vanishing imaginary part then $(T - zI)^{-1}$ is given by
$\Phi_z(T)$, with $\Phi_z(t) = (t - z)^{-1}$. Hence $(T - zI)^{-1}(T - \bar{z}I)^{-1}$ is given by $\Psi_z(T)$,
with $\Psi_z(t) = 1/|t - z|^2$. Therefore by the estimate given in (37) and the
representation (36) applied to $\Phi = \Psi_z$, we obtain

$$\int \frac{dF(\lambda)}{|\lambda - z|^2} \le A',$$

with a finite bound $A'$ that does not depend on $z$,
as long as $z$ is complex and $|x_0 - z| < \epsilon$. We can therefore obtain the same
inequality for $x$ real, $|x_0 - x| < \epsilon$. Now integration in $x \in J_\epsilon$ using the fact that
$\int_{J_\epsilon} \frac{dx}{|\lambda - x|^2} = \infty$ for every $\lambda \in J_\epsilon$, gives $\int_{J_\epsilon} dF(\lambda) = 0$. Thus $F(\lambda)$ is constant in $J_\epsilon$,
but since $x_0$ was an arbitrary point of $J$ the function $F(\lambda)$ is constant throughout
$J$ and the proposition is proved.

7 Exercises 

1. Let $X$ be a set and $\mathcal{M}$ a non-empty collection of subsets of $X$. Prove that if
$\mathcal{M}$ is closed under complements and countable unions of disjoint sets, then $\mathcal{M}$ is
a $\sigma$-algebra.

[Hint: Any countable union of sets can be written as a countable union of disjoint
sets.]

2. Let $(X, \mathcal{M}, \mu)$ be a measure space. One can define the completion of this
space as follows. Let $\overline{\mathcal{M}}$ be the collection of sets of the form $E \cup Z$, where $E \in \mathcal{M}$,
and $Z \subset F$ with $F \in \mathcal{M}$ and $\mu(F) = 0$. Also, define $\overline{\mu}(E \cup Z) = \mu(E)$. Then:

(a) $\overline{\mathcal{M}}$ is the smallest $\sigma$-algebra containing $\mathcal{M}$ and all subsets of elements of $\mathcal{M}$
of measure zero.

(b) The function $\overline{\mu}$ is a measure on $\overline{\mathcal{M}}$, and this measure is complete.

[Hint: To prove $\overline{\mathcal{M}}$ is a $\sigma$-algebra it suffices to see that if $E_1 \in \overline{\mathcal{M}}$, then $E_1^c \in \overline{\mathcal{M}}$.
Write $E_1 = E \cup Z$ with $Z \subset F$, $E$ and $F$ in $\mathcal{M}$. Then $E_1^c = (E \cup F)^c \cup (F - Z)$.]

3. Consider the exterior Lebesgue measure $m_*$ introduced in Chapter 1. Prove that
a set $E$ in $\mathbb{R}^d$ is Carathéodory measurable if and only if $E$ is Lebesgue measurable
in the sense of Chapter 1.

[Hint: If $E$ is Lebesgue measurable and $A$ is any set, choose a $G_\delta$ set $G$ such
that $A \subset G$ and $m_*(A) = m(G)$. Conversely, if $E$ is Carathéodory measurable and
$m_*(E) < \infty$, choose a $G_\delta$ set $G$ with $E \subset G$ and $m_*(E) = m_*(G)$. Then $G - E$
has exterior measure 0.]

4. Let $r$ be a rotation of $\mathbb{R}^d$. Using the fact that the mapping $x \mapsto r(x)$ preserves
Lebesgue measure (see Problem 4 in Chapter 2 and Exercise 26 in Chapter 3), show
that it induces a measure-preserving map of the sphere $S^{d-1}$ with its measure $d\sigma$.

A converse is stated in Problem 4.

5. Use the polar coordinate formula to prove the following:

(a) $\int_{\mathbb{R}^d} e^{-\pi |x|^2} \, dx = 1$ when $d = 2$. Deduce from this that the same identity
holds for all $d$.

(b) $\left( \int_0^\infty e^{-\pi r^2} r^{d-1} \, dr \right) \sigma(S^{d-1}) = 1$, and as a result, $\sigma(S^{d-1}) = 2\pi^{d/2}/\Gamma(d/2)$.

(c) If $B$ is the unit ball, $v_d = m(B) = \pi^{d/2}/\Gamma(d/2 + 1)$, since this quantity
equals $\left( \int_0^1 r^{d-1} \, dr \right) \sigma(S^{d-1})$. (See Exercise 14 in Chapter 2.)
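
The formulas in Exercise 5 can be sanity-checked numerically. The sketch below is an illustration only, not a solution: the function names, the cut-off `r_max`, and the step count are assumptions chosen for the example, and the radial integral is approximated by a midpoint rule.

```python
import math

def sphere_area(d):
    """sigma(S^{d-1}) = 2 pi^{d/2} / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)

def ball_volume(d):
    """v_d = pi^{d/2} / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)

def radial_integral(d, steps=200000, r_max=10.0):
    """Midpoint-rule approximation of the integral of exp(-pi r^2) r^{d-1} on (0, r_max)."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += math.exp(-math.pi * r * r) * r ** (d - 1)
    return total * h

# Part (b): the radial integral times the area of the sphere should be close to 1.
checks = [radial_integral(d) * sphere_area(d) for d in (1, 2, 3, 4)]
```

Since $\int_0^1 r^{d-1} \, dr = 1/d$, part (c) amounts to `ball_volume(d) == sphere_area(d) / d` up to rounding; for instance $d = 3$ gives the familiar values $4\pi$ and $4\pi/3$.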


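6. A version of Green’s formula for the unit ball $B$ in $\mathbb{R}^d$ can be stated as follows.
Suppose $u$ and $v$ are a pair of functions that are in $C^2(\overline{B})$. Then one has

$$\int_B (v \Delta u - u \Delta v) \, dx = \int_{S^{d-1}} \left( v \frac{\partial u}{\partial n} - u \frac{\partial v}{\partial n} \right) d\sigma.$$

Here $S^{d-1}$ is the unit sphere with $d\sigma$ the measure defined in Section 3.2, and
$\partial u/\partial n$, $\partial v/\partial n$ denote the directional derivatives of $u$ and $v$ (respectively) along
the inner normals to $S^{d-1}$.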

Show that the above can be derived from Lemma 4.5 of the previous chapter by 
taking rj = and letting e —^ 0. 


7. There is an alternate version of the mean-value property given in (21) of
Chapter 5. It can be stated as follows. Suppose $u$ is harmonic in $\Omega$, and $B$ is any ball
of center $x_0$ and radius $r$ whose closure is contained in $\Omega$. Then

$$u(x_0) = c \int_{S^{d-1}} u(x_0 + ry) \, d\sigma(y), \quad \text{with } c^{-1} = \sigma(S^{d-1}).$$

Conversely, a continuous function satisfying this mean-value property is harmonic.

[Hint: This can be proved as a direct consequence of the corresponding result
for averages over balls (Theorem 4.27 in Chapter 5), or can be deduced from
Exercise 6.]

8. The fact that the Lebesgue measure is uniquely characterized by its translation
invariance can be made precise by the following assertion: If $\mu$ is a Borel measure
on $\mathbb{R}^d$ that is translation-invariant, and is finite on compact sets, then $\mu$ is a
multiple of Lebesgue measure $m$. Prove this theorem by proceeding as follows.

(a) Suppose $Q_a$ denotes a translate of the cube $\{x : 0 < x_j \le a, \ j = 1, 2, \ldots, d\}$
of side length $a$. If we let $\mu(Q_1) = c$, then $\mu(Q_{1/n}) = cn^{-d}$ for each integer $n$.

(b) As a result $\mu$ is absolutely continuous with respect to $m$, and there is a
locally integrable function $f$ such that

$$\mu(E) = \int_E f \, dx.$$

(c) By the differentiation theorem (Corollary 1.7 in Chapter 3) it follows that
$f(x) = c$ a.e., and hence $\mu = cm$.

[Hint: $Q_1$ can be written as a disjoint union of $n^d$ translates of $Q_{1/n}$.]


9. Let $C([a, b])$ denote the vector space of continuous functions on the closed and
bounded interval $[a, b]$. Suppose we are given a Borel measure $\mu$ on this interval,
with $\mu([a, b]) < \infty$. Then

$$f \mapsto \ell(f) = \int_a^b f(x) \, d\mu(x)$$

is a linear functional on $C([a, b])$, with $\ell$ positive in the sense that $\ell(f) \ge 0$ if $f \ge 0$.

Prove that, conversely, for any linear functional $\ell$ on $C([a, b])$ that is positive in
the above sense, there is a unique finite Borel measure $\mu$ so that $\ell(f) = \int_a^b f \, d\mu$ for
$f \in C([a, b])$.

[Hint: Suppose $a = 0$ and $u \ge 0$. Define $F(u)$ by $F(u) = \lim_{\epsilon \to 0} \ell(f_\epsilon)$, where

$$f_\epsilon(x) = \begin{cases} 1 & \text{for } 0 \le x \le u, \\ 0 & \text{for } u + \epsilon \le x, \end{cases}$$

and $f_\epsilon$ is linear between $u$ and $u + \epsilon$. (See Figure 3.) Then $F$ is increasing and
right-continuous, and $\ell(f)$ can be written as $\int_a^b f(x) \, dF(x)$ via Theorem 3.5.]

The result also holds if $[a, b]$ is replaced by a closed infinite interval; we then
assume that $\ell$ is defined on the continuous functions of bounded support, and
obtain that the resulting $\mu$ is finite on all bounded intervals.

A generalization is given in Problem 5.

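10. Suppose $\nu$, $\nu_1$, $\nu_2$ are signed measures on $(X, \mathcal{M})$ and $\mu$ a (positive) measure
on $\mathcal{M}$. Using the symbols $\perp$ and $\ll$ defined in Section 4.2, prove:

(a) If $\nu_1 \perp \mu$ and $\nu_2 \perp \mu$, then $\nu_1 + \nu_2 \perp \mu$.

(b) If $\nu_1 \ll \mu$ and $\nu_2 \ll \mu$, then $\nu_1 + \nu_2 \ll \mu$.

(c) $\nu_1 \perp \nu_2$ implies $|\nu_1| \perp |\nu_2|$.

(d) $\nu \ll |\nu|$.

(e) If $\nu \perp \mu$ and $\nu \ll \mu$, then $\nu = 0$.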


11. Suppose that $F$ is an increasing normalized function on $\mathbb{R}$, and let $F =
F_A + F_C + F_J$ be the decomposition of $F$ in Exercise 24 in Chapter 3; here $F_A$ is
absolutely continuous, $F_C$ is continuous with $F_C' = 0$ a.e., and $F_J$ is a pure jump
function. Let $\mu = \mu_A + \mu_C + \mu_J$ with $\mu$, $\mu_A$, $\mu_C$, and $\mu_J$ the Borel measures
associated to $F$, $F_A$, $F_C$, and $F_J$, respectively. Verify that:

(i) $\mu_A$ is absolutely continuous with respect to Lebesgue measure and $\mu_A(E) =
\int_E F'(x) \, dx$ for every Lebesgue measurable set $E$.

(ii) As a result, if $F$ is absolutely continuous, then $\int f \, d\mu = \int f \, dF =
\int f(x)F'(x) \, dx$ whenever $f$ and $fF'$ are integrable.

(iii) $\mu_C + \mu_J$ and Lebesgue measure are mutually singular.


12. Suppose $\mathbb{R}^d - \{0\}$ is represented as $\mathbb{R}_+ \times S^{d-1}$, with $\mathbb{R}_+ = \{0 < r < \infty\}$.
Then every open set in $\mathbb{R}^d - \{0\}$ can be written as a countable union of open
rectangles of this product.

[Hint: Consider the countable collection of rectangles of the form

$$\{r_j < r < r'_k\} \times \{\gamma \in S^{d-1} : |\gamma - \gamma_\ell| < 1/n\}.$$

Here $r_j$ and $r'_k$ range over all positive rationals, and $\{\gamma_\ell\}$ is a countable dense set
of $S^{d-1}$.]


13. Let $m_j$ be the Lebesgue measure for the space $\mathbb{R}^{d_j}$, $j = 1, 2$. Consider the
product $\mathbb{R}^d = \mathbb{R}^{d_1} \times \mathbb{R}^{d_2}$ ($d = d_1 + d_2$), with $m$ the Lebesgue measure on $\mathbb{R}^d$. Show
that $m$ is the completion (in the sense of Exercise 2) of the product measure
$m_1 \times m_2$.

14. Suppose $(X_j, \mathcal{M}_j, \mu_j)$, $1 \le j \le k$, is a finite collection of measure spaces.
Show that parallel with the case $k = 2$ considered in Section 3 one can construct
a product measure $\mu_1 \times \mu_2 \times \cdots \times \mu_k$ on $X = X_1 \times X_2 \times \cdots \times X_k$. In fact, for
any set $E \subset X$ such that $E = E_1 \times E_2 \times \cdots \times E_k$, with $E_j \in \mathcal{M}_j$ for all $j$, define
$\mu_0(E) = \prod_{j=1}^{k} \mu_j(E_j)$. Verify that $\mu_0$ extends to a premeasure on the algebra $\mathcal{A}$
of finite disjoint unions of such sets, and then apply Theorem 1.5.

15. The product theory extends to infinitely many factors, under the requisite
assumptions. We consider measure spaces $(X_j, \mathcal{M}_j, \mu_j)$ with $\mu_j(X_j) = 1$ for all
but finitely many $j$. Define a cylinder set $E$ as

$$\{x = (x_j),\ x_j \in E_j,\ E_j \in \mathcal{M}_j, \text{ but } E_j = X_j \text{ for all but finitely many } j\}.$$

For such a set define $\mu_0(E) = \prod_{j=1}^{\infty} \mu_j(E_j)$. If $\mathcal{A}$ is the algebra generated by the
cylinder sets, $\mu_0$ extends to a premeasure on $\mathcal{A}$, and we can apply Theorem 1.5
again.

16. Consider the $d$-dimensional torus $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$. Identify $\mathbb{T}^d$ as $\mathbb{T}^1 \times \cdots \times \mathbb{T}^1$
($d$ factors) and let $\mu$ be the product measure on $\mathbb{T}^d$ given by $\mu = \mu_1 \times \mu_2 \times \cdots \times \mu_d$,
where $\mu_j$ is Lebesgue measure on $X_j$ identified with the circle $\mathbb{T}$. That is, if we
represent each point in $X_j$ uniquely as $x_j$ with $0 < x_j \le 1$, then the measure $\mu_j$ is
the induced Lebesgue measure on $\mathbb{R}^1$ restricted to $(0,1]$.

(a) Check that the completion $\overline{\mu}$ is Lebesgue measure induced on the cube
$Q = \{x : 0 < x_j \le 1, \ j = 1, \ldots, d\}$.

(b) For each function $f$ on $Q$ let $\tilde{f}$ be its extension to $\mathbb{R}^d$ which is periodic, that
is, $\tilde{f}(x + z) = \tilde{f}(x)$ for every $z \in \mathbb{Z}^d$. Then $f$ is measurable on $\mathbb{T}^d$ if and
only if $\tilde{f}$ is measurable on $\mathbb{R}^d$, and $f$ is continuous on $\mathbb{T}^d$ if and only if $\tilde{f}$ is
continuous on $\mathbb{R}^d$.

(c) Suppose $f$ and $g$ are integrable on $\mathbb{T}^d$. Show that the integral defining
$(f * g)(x) = \int_{\mathbb{T}^d} f(x - y)g(y) \, dy$ is finite for a.e. $x$, that $f * g$ is integrable
over $\mathbb{T}^d$, and that $f * g = g * f$.

(d) For any integrable function $f$ on $\mathbb{T}^d$, write

$$f \sim \sum_{n \in \mathbb{Z}^d} a_n e^{2\pi i n \cdot x}$$

to mean that $a_n = \int_{\mathbb{T}^d} f(x) e^{-2\pi i n \cdot x} \, dx$. Prove that if $g$ is also integrable,
and $g \sim \sum_{n \in \mathbb{Z}^d} b_n e^{2\pi i n \cdot x}$, then

$$f * g \sim \sum_{n \in \mathbb{Z}^d} a_n b_n e^{2\pi i n \cdot x}.$$

(e) Verify that $\{e^{2\pi i n \cdot x}\}_{n \in \mathbb{Z}^d}$ is an orthonormal basis for $L^2(\mathbb{T}^d)$. As a result

$$\|f\|_{L^2(\mathbb{T}^d)}^2 = \sum_{n \in \mathbb{Z}^d} |a_n|^2.$$

(f) Let $f$ be any continuous periodic function on $\mathbb{T}^d$. Then $f$ can be uniformly
approximated by finite linear combinations of the exponentials $\{e^{2\pi i n \cdot x}\}_{n \in \mathbb{Z}^d}$.

[Hint: For (e), reduce to the case $d = 1$ by Fubini’s theorem. To prove (f) let
$g(x) = g_\epsilon(x) = \epsilon^{-d}$, if $0 < x_j \le \epsilon$, $j = 1, \ldots, d$, and $g_\epsilon(x) = 0$ elsewhere in $Q$. Then
$(f * g_\epsilon)(x) \to f(x)$ uniformly as $\epsilon \to 0$. However $(f * g_\epsilon)(x) = \sum a_n b_n e^{2\pi i n \cdot x}$ with
$b_n = \int_{\mathbb{T}^d} g_\epsilon(x) e^{-2\pi i n \cdot x} \, dx$, and $\sum |a_n b_n| < \infty$.]

17. By reducing to the case $d = 1$, show that each “rotation” $x \mapsto x + \alpha$ of the
torus $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$ is measure preserving, for any $\alpha \in \mathbb{R}^d$.

18. Suppose $\tau$ is a measure-preserving transformation on a measure space $(X, \mu)$
with $\mu(X) = 1$. Recall that a measurable set $E$ is invariant if $\tau^{-1}(E)$ and $E$ differ
by a set of measure zero. A sharper notion is to require that $\tau^{-1}(E)$ equal $E$.
Prove that if $E$ is any invariant set, there is a set $E'$ so that $E' = \tau^{-1}(E')$, and $E$
and $E'$ differ by a set of measure zero.

[Hint: Let $E' = \limsup_{n \to \infty} \{\tau^{-n}(E)\} = \bigcap_{n=0}^{\infty} \left( \bigcup_{k \ge n} \tau^{-k}(E) \right)$.]

19. Let $\tau$ be a measure-preserving transformation on $(X, \mu)$ with $\mu(X) = 1$. Then
$\tau$ is ergodic if and only if whenever $\nu$ is absolutely continuous with respect to $\mu$ and
$\nu$ is invariant (that is, $\nu(\tau^{-1}(E)) = \nu(E)$ for all measurable sets $E$), then $\nu = c\mu$,
with $c$ a constant.


20. Suppose $\tau$ is a measure-preserving transformation on $(X, \mu)$. If

$$\mu(\tau^{-n}(E) \cap F) \to \mu(E)\mu(F)$$

as $n \to \infty$ for all measurable sets $E$ and $F$, then $(T^n f, g) \to (f, 1)(1, g)$ whenever
$f, g \in L^2(X)$ with $(Tf)(x) = f(\tau(x))$. Thus $\tau$ is mixing.

[Hint: By linearity the hypothesis implies the conclusion whenever $f$ and $g$ are
simple functions.]


21. Let $\mathbb{T}^d$ be the torus, and $\tau : x \mapsto x + \alpha$ the mapping arising in Exercise 17.
Then $\tau$ is ergodic if and only if $\alpha = (\alpha_1, \dots, \alpha_d)$ with $\alpha_1, \alpha_2, \dots, \alpha_d$, and 1
linearly independent over the rationals. To do this show that:

(a) $\frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) \to \int_{\mathbb{T}^d} f(x)\,dx$ as $m \to \infty$, for each $x \in \mathbb{T}^d$, whenever $f$ is
continuous and periodic and $\alpha$ satisfies the hypothesis.

(b) Prove as a result that in this case $\tau$ is uniquely ergodic.

[Hint: Use (f) in Exercise 16.]


22. Let $X = \prod_{i=1}^{\infty} X_i$, where each $(X_i, \mu_i)$ is identical to $(X_1, \mu_1)$, with $\mu_1(X_1) = 1$,
and let $\mu$ be the corresponding product measure defined in Exercise 15. Define
the shift $\tau : X \to X$ by $\tau((x_1, x_2, \dots)) = (x_2, x_3, \dots)$ for $x = (x_i) \in \prod_{i=1}^{\infty} X_i$.

(a) Verify that $\tau$ is a measure-preserving transformation.

(b) Prove that $\tau$ is ergodic by showing that it is mixing.

(c) Note that in general $\tau$ is not uniquely ergodic.

If we define the corresponding shift on the two-sided infinite product, then $\tau$ is
also a measure-preserving isomorphism.

[Hint: For (b) note that $\mu(\tau^{-n}(E) \cap F) = \mu(E)\mu(F)$ whenever $E$ and $F$ are cylinder
sets and $n$ is sufficiently large. For (c) note that, for example, if we fix a point
$x \in X_1$, the set $E = \{(x_i) : x_j = x \text{ all } j\}$ is invariant.]

23. Let $X = \prod_{i=1}^{\infty} Z(2)$, where each factor is the two-point space $Z(2) = \{0, 1\}$
with $\mu_1(0) = \mu_1(1) = 1/2$, and suppose $\mu$ denotes the product measure on $X$. Consider
the mapping $D : X \to [0,1]$ given by $D(\{a_j\}) = \sum_{j=1}^{\infty} a_j/2^j$. Then there are
denumerable sets $Z_1 \subset X$ and $Z_2 \subset [0,1]$, such that:

(a) $D$ is a bijection from $X - Z_1$ to $[0,1] - Z_2$.

(b) A set $E$ in $X$ is measurable if and only if $D(E)$ is measurable in $[0,1]$, and
$\mu(E) = m(D(E))$, where $m$ is Lebesgue measure on $[0,1]$.

(c) The shift map on $\prod_{i=1}^{\infty} Z(2)$ then becomes the doubling map of example (b)
in Section 5.4.

24. Consider the following generalization of the doubling map. For each integer
$m$, $m \ge 2$, we define the map $\tau_m$ of $(0,1]$ by $\tau_m(x) = mx \bmod 1$.

(a) Verify that $\tau_m$ is measure-preserving for Lebesgue measure.

(b) Show that $\tau_m$ is mixing, hence ergodic.

(c) Prove as a consequence that almost every number $x$ is normal in the scale $m$,
in the following sense. Consider the $m$-adic expansion of $x$,

$$x = \sum_{j=1}^{\infty} \frac{a_j}{m^j}, \quad \text{where each } a_j \text{ is an integer } 0 \le a_j \le m - 1.$$

Then $x$ is normal if for each integer $k$, $0 \le k \le m - 1$,

$$\frac{\#\{j : a_j = k,\ 1 \le j \le N\}}{N} \to \frac{1}{m} \quad \text{as } N \to \infty.$$

Note the analogy with the equidistribution statements in Section 2, Chapter 4,
of Book I.
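
The digit count in part (c) can be made concrete by generating $m$-adic digits through repeated application of the map. The following is only an illustrative sketch under stated assumptions: the function names are hypothetical, exact rational arithmetic is used, and the representative of $mx \bmod 1$ is taken in $[0,1)$ rather than $(0,1]$ (a convention chosen here; the two differ only on a countable set). Rational inputs have eventually periodic expansions, so they show how the frequency is formed and say nothing about almost every $x$.

```python
from collections import Counter
from fractions import Fraction


def m_adic_digits(x, m, count):
    """Return the first `count` base-m digits a_1, a_2, ... of x in [0, 1).

    Each step reads off a_j = floor(m * x) and then replaces x by the
    fractional part of m * x, i.e. applies the map x -> m x mod 1 once.
    """
    x = Fraction(x)
    digits = []
    for _ in range(count):
        x *= m
        digit = int(x)      # integer part, between 0 and m - 1
        digits.append(digit)
        x -= digit          # fractional part: the image under the map
    return digits


def digit_frequencies(x, m, count):
    """Relative frequency of each digit k = 0, ..., m-1 among the first `count` digits."""
    tally = Counter(m_adic_digits(x, m, count))
    return [Fraction(tally[k], count) for k in range(m)]
```

For instance, `m_adic_digits(Fraction(1, 3), 2, 6)` produces the alternating digits of $1/3$ in base 2, and both digit frequencies equal $1/2$ over any even number of digits.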


25. Show that the mean ergodic theorem still holds if we replace the assumption
that $T$ is an isometry by the assumption that $T$ is a contraction, that is,
$\|Tf\| \le \|f\|$ for all $f \in \mathcal{H}$.

[Hint: Prove that $T$ is a contraction if and only if $T^*$ is a contraction, and use the
identity $(f, T^*f) = (Tf, f)$.]

26. There is an $L^2$ version of the maximal ergodic theorem. Suppose $\tau$ is a
measure-preserving transformation on $(X, \mu)$. Here we do not assume that $\mu(X) < \infty$.
Then

$$f^*(x) = \sup_{m} \frac{1}{m} \sum_{k=0}^{m-1} |f(\tau^k(x))|$$

satisfies

$$\|f^*\|_{L^2(X)} \le c\,\|f\|_{L^2(X)}, \quad \text{whenever } f \in L^2(X).$$

The proof is the same as outlined in Problem 6, Chapter 5 for the maximal function
on $\mathbb{R}^d$. With this, extend the pointwise ergodic theorem to the case where $\mu(X) = \infty$,
as follows:

(a) Show that $\lim_{m \to \infty} \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x))$ converges for a.e. $x$ to $P(f)(x)$ for
every $f \in L^2(X)$, because this holds for a dense subspace of $L^2(X)$.

(b) Prove that the conclusion holds for every $f \in L^1(X)$, because it holds for
the dense subspace $L^1(X) \cap L^2(X)$.


27. We saw that if $\|f_n\|_{L^2} \le 1$, then $\frac{f_n(x)}{n} \to 0$ as $n \to \infty$ for a.e. $x$. However, show
that the analogue where one replaces the $L^2$-norm by the $L^1$-norm fails, by
constructing a sequence $\{f_n\}$, $f_n \in L^1(X)$, $\|f_n\|_{L^1} \le 1$, but with
$\limsup_{n \to \infty} \frac{f_n(x)}{n} = \infty$ for a.e. $x$.

[Hint: Find intervals $I_n \subset [0,1]$, so that $m(I_n) = 1/(n \log n)$ but $\limsup_{n \to \infty} \{I_n\} = [0,1]$.
Then take $f_n(x) = n \log n\, \chi_{I_n}$.]

28. We know by the Borel-Cantelli lemma that if $\{E_n\}$ is a collection of measurable
sets in a measure space $(X, \mu)$ and $\sum_{n=1}^{\infty} \mu(E_n) < \infty$, then $E = \limsup_{n \to \infty} \{E_n\}$
has measure zero.

In the opposite direction, if $\tau$ is a mixing measure-preserving transformation
on $X$ (with $\mu(X) = 1$), then whenever $\sum_{n=1}^{\infty} \mu(E_n) = \infty$, there are integers $m = m_n$
so that if $E'_n = \tau^{-m_n}(E_n)$, then $\limsup_{n \to \infty} (E'_n) = X$, except for a set of
measure 0.


8 Problems 


1. Suppose $\Phi$ is a $C^1$ bijection of an open set $\mathcal{O}$ in $\mathbb{R}^d$ onto another open set $\mathcal{O}'$
in $\mathbb{R}^d$.

(a) If $E$ is a measurable subset of $\mathcal{O}$, then $\Phi(E)$ is also measurable.

(b) $m(\Phi(E)) = \int_E |\det \Phi'(x)|\,dx$, where $\Phi'$ is the Jacobian of $\Phi$.

(c) $\int_{\mathcal{O}'} f(y)\,dy = \int_{\mathcal{O}} f(\Phi(x))\,|\det \Phi'(x)|\,dx$ whenever $f$ is integrable on $\mathcal{O}'$.

[Hint: To prove (a) follow the argument in Exercise 8, Chapter 1. For (b) assume
$E$ is a bounded open set, and write $E$ as $\bigcup_{j=1}^{\infty} Q_j$, where $Q_j$ are cubes whose
interiors are disjoint, and whose diameters are less than $\epsilon$. Let $z_k$ be the center of
$Q_k$. Then if $x \in Q_k$,

$$\Phi(x) = \Phi(z_k) + \Phi'(z_k)(x - z_k) + o(\epsilon),$$

hence $\Phi(Q_k) = \Phi(z_k) + \Phi'(z_k)(Q_k - z_k) + o(\epsilon)$, and as a result
$(1 - \eta(\epsilon))\Phi'(z_k)(Q_k - z_k) \subset \Phi(Q_k) - \Phi(z_k) \subset (1 + \eta(\epsilon))\Phi'(z_k)(Q_k - z_k)$,
where $\eta(\epsilon) \to 0$ as $\epsilon \to 0$. This means that

$$m(\Phi(\mathcal{O})) = \sum_k m(\Phi(Q_k)) = \sum_k |\det(\Phi'(z_k))|\, m(Q_k) + o(1) \quad \text{as } \epsilon \to 0$$

on account of the linear transformation property of the Lebesgue measure given in
Problem 4 of Chapter 2. Note that (b) is (c) for $f(\Phi(x)) = \chi_E(x)$.]


2. Show as a consequence of the previous problem: the measure $d\mu = \frac{dx\,dy}{y^2}$ in the
upper half-plane $\mathbb{R}^2_+ = \{z = x + iy,\ y > 0\}$ is preserved by any fractional linear
transformation $z \mapsto \frac{az + b}{cz + d}$, where $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ belongs to $SL_2(\mathbb{R})$.

3. Let $S$ be a hypersurface in $\mathbb{R}^d = \mathbb{R}^{d-1} \times \mathbb{R}$, given by

$$S = \{(x, y) \in \mathbb{R}^{d-1} \times \mathbb{R} : y = F(x)\},$$

with $F$ a $C^1$ function defined on an open set $\Omega$ in $\mathbb{R}^{d-1}$. For each subset $E \subset \Omega$
we write $\hat{E}$ for the corresponding subset of $S$ given by $\hat{E} = \{(x, F(x)) : x \in E\}$. We
note that the Borel sets of $S$ can be defined in terms of the metric on $S$ (which is
the restriction of the Euclidean metric on $\mathbb{R}^d$). Thus if $E$ is a Borel set in $\Omega$, then
$\hat{E}$ is a Borel subset of $S$.

(a) Let $\mu$ be the Borel measure on $S$ given by

$$\mu(\hat{E}) = \int_E \sqrt{1 + |\nabla F|^2}\,dx.$$

If $B$ is a ball in $\Omega$, let $\hat{B}^\delta = \{(x, y) \in \mathbb{R}^d : d((x, y), \hat{B}) < \delta\}$. Show that

$$\mu(\hat{B}) = \lim_{\delta \to 0} \frac{1}{2\delta}\, m(\hat{B}^\delta),$$

where $m$ denotes the $d$-dimensional Lebesgue measure. This result is analogous
to Theorem 4.4 in Chapter 3.

(b) One may apply (a) to the case when $S$ is the (upper) half of the unit sphere
in $\mathbb{R}^d$, given by $y = F(x)$, $F(x) = (1 - |x|^2)^{1/2}$, $|x| < 1$, $x \in \mathbb{R}^{d-1}$. Show
that in this case $d\mu = d\sigma$, the measure on the sphere arising in the polar
coordinate formula in Section 3.2.

(c) The above conclusion allows one to write an explicit formula for $d\sigma$ in
terms of spherical coordinates. Take, for example, the case $d = 3$, and
write $y = \cos\theta$, $x = (x_1, x_2) = (\sin\theta\cos\varphi, \sin\theta\sin\varphi)$ with $0 \le \theta < \pi/2$,
$0 \le \varphi < 2\pi$. Then according to (a) and (b) the element of area $d\sigma$ equals
$(1 - |x|^2)^{-1/2}\,dx$. Use the change of variable theorem in Problem 1 to deduce
that in this case $d\sigma = \sin\theta\,d\theta\,d\varphi$. This may be generalized to $d$ dimensions,
$d \ge 2$, to obtain the formulas in Section 2.4 of the appendix in Book I.

4.* Let $\mu$ be a Borel measure on the sphere $S^{d-1}$ which is rotation-invariant in the
following sense: $\mu(r(E)) = \mu(E)$ for every rotation $r$ of $\mathbb{R}^d$ and each Borel subset
$E$ of $S^{d-1}$. If $\mu(S^{d-1}) < \infty$, then $\mu$ is a constant multiple of the measure $\sigma$ arising
in the polar coordinate integration formula.

[Hint: Show that

$$\int_{S^{d-1}} Y_k(x)\,d\mu(x) = 0$$

for every surface spherical harmonic of degree $k \ge 1$. As a result, there is a constant
$c$ so that

$$\int_{S^{d-1}} f\,d\mu = c \int_{S^{d-1}} f\,d\sigma$$

for every continuous function $f$ on $S^{d-1}$.]


5.* Suppose $X$ is a metric space, and $\mu$ is a Borel measure on $X$ with the property
that $\mu(B) < \infty$ for every ball $B$. Define $C_0(X)$ to be the vector space of continuous
functions on $X$ that are each supported in some closed ball. Then $\ell(f) = \int_X f\,d\mu$
defines a linear functional on $C_0(X)$ that is positive, that is, $\ell(f) \ge 0$ if $f \ge 0$.

Conversely, for any positive linear functional $\ell$ on $C_0(X)$, there exists a unique
Borel measure $\mu$ that is finite on all balls, such that $\ell(f) = \int f\,d\mu$.

6. Consider an automorphism $A$ of $\mathbb{T}^d = \mathbb{R}^d/\mathbb{Z}^d$, that is, $A$ is a linear isomorphism
of $\mathbb{R}^d$ that preserves the lattice $\mathbb{Z}^d$. Note that $A$ can be written as a $d \times d$ matrix
whose entries are integers, with $\det A = \pm 1$. Define the mapping $\tau : \mathbb{T}^d \to \mathbb{T}^d$ by
$\tau(x) = A(x)$.

(a) Observe that $\tau$ is a measure-preserving isomorphism of $\mathbb{T}^d$.

(b) Show that $\tau$ is ergodic (in fact, mixing) if and only if $A$ has no eigenvalues
of the form $e^{2\pi i p/q}$, where $p$ and $q$ are integers.

(c) Note that $\tau$ is never uniquely ergodic.

[Hint: The condition (b) is the same as $(A^t)^q$ has no invariant vectors, where $A^t$ is
the transpose of $A$. Note also that $f(\tau^k(x)) = e^{2\pi i (A^t)^k(n) \cdot x}$ where $f(x) = e^{2\pi i n \cdot x}$.]


7.* There is a version of the maximal ergodic theorem that is akin to the "rising
sun lemma" and Exercise 6 in Chapter 3.

Suppose $f$ is real-valued, and $f^{\#}(x) = \sup_m \frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x))$. Let
$E_0 = \{x : f^{\#}(x) > 0\}$. Then

$$\int_{E_0} f(x)\,dx \ge 0.$$

As a result (when we apply this to $f(x) - \alpha$), we get when $f \ge 0$ that

$$\mu\{x : f^*(x) > \alpha\} \le \frac{1}{\alpha} \int_{\{f^*(x) > \alpha\}} f(x)\,dx.$$

In particular, the constant $A$ in Theorem 5.3 can be taken to be 1.

8. Let $X = [0,1)$, $\tau(x) = \langle 1/x \rangle$ for $x \ne 0$, $\tau(0) = 0$. Here $\langle x \rangle$ denotes the fractional
part of $x$. With the measure $d\mu = \frac{1}{\log 2} \frac{dx}{1 + x}$, we have of course $\mu(X) = 1$.

Show that $\tau$ is a measure-preserving transformation.

[Hint: $\sum_{k=1}^{\infty} \frac{1}{(x + k)(x + k + 1)} = \frac{1}{1 + x}$.]

9.* The transformation $\tau$ in the previous problem is ergodic.

10.* The connection between continued fractions and the transformation $\tau(x) = \langle 1/x \rangle$
will now be described. A continued fraction, $a_0 + 1/(a_1 + 1/a_2) \cdots$, also
written as $[a_0 a_1 a_2 \cdots]$, where the $a_j$ are positive integers, can be assigned to any
positive real number $x$ in the following way. Starting with $x$, we successively
transform it by two alternating operations: reducing it modulo 1 to lie in $[0,1)$,
and then taking the reciprocal of that number. The integers $a_j$ that arise then
define the continued fraction of $x$.

Thus we set $x = a_0 + r_0$, where $a_0 = [x]$ = the greatest integer in $x$, and $r_0 \in [0,1)$.
Next we write $1/r_0 = a_1 + r_1$, with $a_1 = [1/r_0]$, $r_1 \in [0,1)$, to obtain successively
$1/r_{n-1} = a_n + r_n$, where $a_n = [1/r_{n-1}]$, $r_n \in [0,1)$. If $r_n = 0$ for some $n$,
we write $a_k = 0$ for all $k > n$, and say that such a continued fraction terminates.

Note that if $0 \le x < 1$, then $r_0 = x$ and $a_1 = [1/x]$, while $r_1 = \langle 1/x \rangle = \tau(x)$.
More generally then, $a_k(x) = [1/\tau^{k-1}(x)] = a_1(\tau^{k-1}(x))$. The following properties
of continued fractions of positive real numbers $x$ are known:

(a) The continued fraction of $x$ terminates if and only if $x$ is rational.

(b) If $x = [a_0 a_1 \cdots a_n \cdots]$, and $x_N = [a_0 a_1 \cdots a_N 0 0 \cdots]$, then $x_N \to x$ as $N \to \infty$.
The sequence $\{x_N\}$ gives essentially an optimal approximation of $x$ by
rationals.

(c) The continued fraction is periodic, that is, $a_{k+N} = a_k$ for some $N \ge 1$, and
all sufficiently large $k$, if and only if $x$ is an algebraic number of degree $\le 2$
over the rationals.

(d) One can conclude that $\frac{a_1 + a_2 + \cdots + a_n}{n} \to \infty$ as $n \to \infty$ for almost every $x$. In
particular, the set of numbers $x$ whose continued fractions $[a_0 a_1 \cdots a_n \cdots]$
are bounded has measure zero.

[Hint: For (d) apply a consequence of the pointwise ergodic theorem, which is as
follows: Suppose $f \ge 0$, and $\int f\,d\mu = \infty$. If $\tau$ is ergodic, then $\frac{1}{m} \sum_{k=0}^{m-1} f(\tau^k(x)) \to \infty$
for a.e. $x$ as $m \to \infty$. In the present case take $f(x) = [1/x]$.]
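
The alternating "reduce modulo 1, then take the reciprocal" procedure of Problem 10 can be sketched in a few lines. This is an illustration only, assuming exact rational inputs so that the expansion terminates as in property (a); the function names are hypothetical and not taken from the text.

```python
from fractions import Fraction
from math import floor


def continued_fraction(x, max_terms=20):
    """Return [a_0, a_1, ...] for a non-negative rational x.

    Follows the recipe x = a_0 + r_0, 1/r_0 = a_1 + r_1, ..., and stops
    when a remainder r_n vanishes or after max_terms entries.
    """
    x = Fraction(x)
    a0 = floor(x)
    terms = [a0]
    remainder = x - a0
    while remainder != 0 and len(terms) < max_terms:
        reciprocal = 1 / remainder
        a = floor(reciprocal)
        terms.append(a)
        remainder = reciprocal - a
    return terms


def gauss_map(x):
    """The map x -> fractional part of 1/x on [0, 1), with 0 sent to 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    reciprocal = 1 / x
    return reciprocal - floor(reciprocal)
```

With $x = 3/8$ the routine returns the entries $0, 2, 1, 2$, and the same entries $a_k$ arise as $[1/\tau^{k-1}(x)]$ along the orbit $3/8 \mapsto 2/3 \mapsto 1/2 \mapsto 0$ of the map.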




7 Hausdorff Measure and
Fractals


Carathéodory developed a remarkably simple generalization
of Lebesgue's measure theory which in particular
allowed him to define the p-dimensional measure of
a set in q-dimensional space. In what follows, I present
a small addition.... a clarification of p-dimensional
measure that leads immediately to an extension to
non-integral p, and thus gives rise to sets of fractional
dimension.

F. Hausdorff, 1919


I coined fractal from the Latin adjective fractus. The
corresponding Latin verb frangere means to "break":
to create irregular fragments.

B. Mandelbrot, 1977


The deeper study of the geometric properties of sets often requires 
an analysis of their extent or “mass” that goes beyond what can be 
expressed in terms of Lebesgue measure. It is here that the notions 
of the dimension of a set (which can be fractional) and an associated 
measure play a crucial role. 

Two initial ideas may help to provide an intuitive grasp of the concept
of the dimension of a set. The first can be understood in terms of how
the set replicates under scalings. Given the set $E$, let us suppose that
for some positive number $n$ we have that $nE = E_1 \cup \dots \cup E_m$, where the
sets $E_j$ are $m$ essentially disjoint congruent copies of $E$. Note that if
$E$ were a line segment this would hold with $m = n$; if $E$ were a square,
we would have $m = n^2$; if $E$ were a cube, then $m = n^3$; etc. Thus, more
generally, we might be tempted to say that $E$ has dimension $\alpha$ if $m = n^\alpha$.
Observe that if $E$ is the Cantor set $\mathcal{C}$ in $[0,1]$, then $3\mathcal{C}$ consists of 2 copies
of $\mathcal{C}$, one in $[0,1]$ and the other in $[2,3]$. Here $n = 3$, $m = 2$, and we would
be led to conclude that $\log 2/\log 3$ is the dimension of the Cantor set.

Another approach is relevant for curves that are not necessarily rectifiable.
Start with a curve $\Gamma = \{\gamma(t) : a \le t \le b\}$, and for each $\epsilon > 0$
consider polygonal lines joining $\gamma(a)$ to $\gamma(b)$, whose vertices lie on
successive points of $\Gamma$, with each segment not exceeding $\epsilon$ in length. Denote
by $\#(\epsilon)$ the least number of segments that arise for such polygonal lines.
If $\#(\epsilon) \approx \epsilon^{-1}$ as $\epsilon \to 0$, then $\Gamma$ is rectifiable. However, $\#(\epsilon)$ may well
grow more rapidly than $\epsilon^{-1}$ as $\epsilon \to 0$. If we had $\#(\epsilon) \approx \epsilon^{-\alpha}$, $1 < \alpha$,
then, in the spirit of the previous example, it would be natural to say
that $\Gamma$ has dimension $\alpha$. These considerations have even an interest in
other parts of science. For instance, in studying the question of determining
the length of the border of a country or its coastline, L.F. Richardson
found that the length of the west coast of Britain obeyed the empirical
law $\#(\epsilon) \approx \epsilon^{-\alpha}$, with $\alpha$ approximately 1.5. Thus one might conclude
that the coast has fractional dimension!
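
Both heuristics above reduce to solving a power law for its exponent. The sketch below is illustrative only: the function names are hypothetical, and the two-point estimate is meant for idealized data obeying an exact power law, not for real coastline measurements.

```python
from math import log


def similarity_dimension(scale, copies):
    """Exponent a solving copies = scale ** a (the relation m = n^a)."""
    if scale <= 1 or copies < 1:
        raise ValueError("need scale > 1 and copies >= 1")
    return log(copies) / log(scale)


def scaling_exponent(eps1, count1, eps2, count2):
    """Exponent a in count ~ eps ** (-a), from two (eps, count) pairs."""
    return log(count2 / count1) / log(eps1 / eps2)
```

A segment ($n = 3$, $m = 3$), a square ($n = 2$, $m = 4$) and a cube ($n = 2$, $m = 8$) give exponents 1, 2 and 3 up to rounding, while the Cantor set ($n = 3$, $m = 2$) gives $\log 2/\log 3$, roughly 0.63.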

While there are a number of different ways to make some of these
heuristic notions precise, the theory that has the widest scope and greatest
flexibility is the one involving Hausdorff measure and Hausdorff dimension.
Probably the most elegant and simplest illustration of this
theory can be seen in terms of its application to a general class of
self-similar sets, and this is what we consider first. Among these are the
curves of von Koch type, and these can have any dimension between 1
and 2.

Next, we turn to an example of a space-filling curve, which, broadly 
speaking, falls under the scope of self-replicating constructions. Not 
only does this curve have an intrinsic interest, but its nature reveals the 
important fact that from the point of view of measure theory the unit 
interval and the unit square are the same. 

Our final topic is of a somewhat different nature. It begins with the
realization of an unexpected regularity that all subsets of $\mathbb{R}^d$ (of finite
Lebesgue measure) enjoy, when $d \ge 3$. This property fails in two dimensions,
and the key counter-example is the Besicovitch set. This set
appears also in a number of other problems. While it has measure zero,
this is barely so, since its Hausdorff dimension is necessarily 2.

1 Hausdorff measure 

The theory begins with the introduction of a new notion of volume or
mass. This "measure" is closely tied with the idea of dimension which
prevails throughout the subject. More precisely, following Hausdorff,
one considers for each appropriate set $E$ and each $\alpha > 0$ the quantity
$m_\alpha(E)$, which can be interpreted as the $\alpha$-dimensional mass of $E$ among
sets of dimension $\alpha$, where the word "dimension" carries (for now) only
an intuitive meaning. Then, if $\alpha$ is larger than the dimension of the set
$E$, the set has a negligible mass, and we have $m_\alpha(E) = 0$. If $\alpha$ is smaller
than the dimension of $E$, then $E$ is very large (comparatively), hence
$m_\alpha(E) = \infty$. For the critical case when $\alpha$ is the dimension of $E$, the
quantity $m_\alpha(E)$ describes the actual $\alpha$-dimensional size of the set.

Two examples, to which we shall return in more detail later, illustrate 
this circle of ideas. 

First, recall that the standard Cantor set $\mathcal{C}$ in $[0,1]$ has zero Lebesgue
measure. This statement expresses the fact that $\mathcal{C}$ has one-dimensional
mass or length equal to zero. However, we shall prove that $\mathcal{C}$ has a
well-defined fractional Hausdorff dimension of $\log 2/\log 3$, and that the
corresponding Hausdorff measure of the Cantor set is positive and finite.

Another illustration of the theory developed below consists of starting
with $\Gamma$, a rectifiable curve in the plane. Then $\Gamma$ has zero two-dimensional
Lebesgue measure. This is intuitively clear, since $\Gamma$ is a one-dimensional
object in a two-dimensional space. This is where the Hausdorff measure
comes into play: the quantity $m_1(\Gamma)$ is not only finite, but precisely equal
to the length of $\Gamma$ as we defined it in Section 3.1 of Chapter 3.

We first consider the relevant exterior measure, defined in terms of 
coverings, whose restriction to the Borel sets is the desired Hausdorff 
measure. 

For any subset $E$ of $\mathbb{R}^d$, we define the exterior $\alpha$-dimensional Hausdorff
measure of $E$ by

$$m^*_\alpha(E) = \lim_{\delta \to 0} \inf \Big\{ \sum_k (\operatorname{diam} F_k)^\alpha : E \subset \bigcup_{k=1}^{\infty} F_k,\ \operatorname{diam} F_k \le \delta \text{ all } k \Big\},$$

where $\operatorname{diam} S$ denotes the diameter of the set $S$, that is, $\operatorname{diam} S =
\sup\{|x - y| : x, y \in S\}$. In other words, for each $\delta > 0$ we consider covers
of $E$ by countable families of (arbitrary) sets with diameter less than $\delta$,
and take the infimum of the sum $\sum_k (\operatorname{diam} F_k)^\alpha$. We then define $m^*_\alpha(E)$
as the limit of these infimums as $\delta$ tends to 0. We note that the quantity

$$\mathcal{H}^\delta_\alpha(E) = \inf \Big\{ \sum_k (\operatorname{diam} F_k)^\alpha : E \subset \bigcup_{k=1}^{\infty} F_k,\ \operatorname{diam} F_k \le \delta \text{ all } k \Big\}$$

is increasing as $\delta$ decreases, so that the limit

$$m^*_\alpha(E) = \lim_{\delta \to 0} \mathcal{H}^\delta_\alpha(E)$$

exists, although $m^*_\alpha(E)$ could be infinite. We note that in particular,
one has $\mathcal{H}^\delta_\alpha(E) \le m^*_\alpha(E)$ for all $\delta > 0$. When defining the exterior
measure $m^*_\alpha(E)$ it is important to require that the coverings be of
sets of arbitrarily small diameters; this is the thrust of the definition
$m^*_\alpha(E) = \lim_{\delta \to 0} \mathcal{H}^\delta_\alpha(E)$. This requirement, which is not relevant for
Lebesgue measure, is needed to ensure the basic additive feature stated
in Property 3 below. (See also Exercise 12.)

Scaling is the key notion that appears at the heart of the definition of
the exterior Hausdorff measure. Loosely speaking, the measure of a set
scales according to its dimension. For instance, if $\Gamma$ is a one-dimensional
subset of $\mathbb{R}^d$, say a smooth curve of length $L$, then $r\Gamma$ has total length
$rL$. If $Q$ is a cube in $\mathbb{R}^d$, the volume of $rQ$ is $r^d|Q|$. This feature is
captured in the definition of exterior Hausdorff measure by the fact that
if the set $F$ is scaled by $r$, then $(\operatorname{diam} F)^\alpha$ scales by $r^\alpha$. This key idea
reappears in the study of self-similar sets in Section 2.2.


We begin with a list of properties satisfied by the Hausdorff exterior 
measure. 

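Property 1 (Monotonicity) If $E_1 \subset E_2$, then $m^*_\alpha(E_1) \le m^*_\alpha(E_2)$.

This is straightforward, since any cover of $E_2$ is also a cover of $E_1$.

Property 2 (Sub-additivity) $m^*_\alpha\big(\bigcup_{j=1}^{\infty} E_j\big) \le \sum_{j=1}^{\infty} m^*_\alpha(E_j)$ for any
countable family $\{E_j\}$ of sets in $\mathbb{R}^d$.

For the proof, fix $\delta$, and choose for each $j$ a cover $\{F_{j,k}\}_{k=1}^{\infty}$ of $E_j$ by
sets of diameter less than $\delta$ such that $\sum_k (\operatorname{diam} F_{j,k})^\alpha \le \mathcal{H}^\delta_\alpha(E_j) + \epsilon/2^j$.
Since $\bigcup_{j,k} F_{j,k}$ is a cover of $E = \bigcup_j E_j$ by sets of diameter less than $\delta$, we must
have

$$\mathcal{H}^\delta_\alpha(E) \le \sum_{j=1}^{\infty} \mathcal{H}^\delta_\alpha(E_j) + \epsilon \le \sum_{j=1}^{\infty} m^*_\alpha(E_j) + \epsilon.$$

Since $\epsilon$ is arbitrary, the inequality $\mathcal{H}^\delta_\alpha(E) \le \sum_j m^*_\alpha(E_j)$ holds, and we let
$\delta$ tend to 0 to prove the countable sub-additivity of $m^*_\alpha$.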

Property 3 If $d(E_1, E_2) > 0$, then $m^*_\alpha(E_1 \cup E_2) = m^*_\alpha(E_1) + m^*_\alpha(E_2)$.

It suffices to prove that $m^*_\alpha(E_1 \cup E_2) \ge m^*_\alpha(E_1) + m^*_\alpha(E_2)$ since the reverse
inequality is guaranteed by sub-additivity. Fix $\epsilon > 0$ with $\epsilon <
d(E_1, E_2)$. Given any cover of $E_1 \cup E_2$ with sets $F_1, F_2, \dots$, of diameter
less than $\delta$, where $\delta < \epsilon$, we let

$$F'_j = E_1 \cap F_j \quad \text{and} \quad F''_j = E_2 \cap F_j.$$

Then $\{F'_j\}$ and $\{F''_j\}$ are covers for $E_1$ and $E_2$, respectively, and are
disjoint. Hence,

$$\sum_j (\operatorname{diam} F'_j)^\alpha + \sum_i (\operatorname{diam} F''_i)^\alpha \le \sum_k (\operatorname{diam} F_k)^\alpha.$$

Taking the infimum over the coverings, and then letting $\delta$ tend to zero
yields the desired inequality.

At this point, we note that $m^*_\alpha$ satisfies all the properties of a metric
Carathéodory exterior measure as discussed in Chapter 6. Thus $m^*_\alpha$
is a countably additive measure when restricted to the Borel sets. We
shall therefore restrict ourselves to Borel sets and write $m_\alpha(E)$ instead
of $m^*_\alpha(E)$. The measure $m_\alpha$ is called the $\alpha$-dimensional Hausdorff
measure.

Property 4 If $\{E_j\}$ is a countable family of disjoint Borel sets, and
$E = \bigcup_{j=1}^{\infty} E_j$, then

$$m_\alpha(E) = \sum_{j=1}^{\infty} m_\alpha(E_j).$$

For what follows in this chapter, the full additivity in the above property
is not needed, and we can manage with a weaker form whose proof
is elementary and not dependent on the developments of Chapter 6. (See
Exercise 2.)

Property 5 Hausdorff measure is invariant under translations

$$m_\alpha(E + h) = m_\alpha(E) \quad \text{for all } h \in \mathbb{R}^d,$$

and rotations

$$m_\alpha(rE) = m_\alpha(E),$$

where $r$ is a rotation in $\mathbb{R}^d$.

Moreover, it scales as follows:

$$m_\alpha(\lambda E) = \lambda^\alpha m_\alpha(E) \quad \text{for all } \lambda > 0.$$

These conclusions follow once we observe that the diameter of a set $S$
is invariant under translations and rotations, and satisfies $\operatorname{diam}(\lambda S) =
\lambda \operatorname{diam}(S)$ for $\lambda > 0$.

We describe next a series of properties of Hausdorff measure, the first 
of which is immediate from the definitions. 


Property 6 The quantity $m_0(E)$ counts the number of points in $E$,
while $m_1(E) = m(E)$ for all Borel sets $E \subset \mathbb{R}$. (Here $m$ denotes the
Lebesgue measure on $\mathbb{R}$.)

In fact, note that in one dimension every set of diameter $\delta$ is contained in
an interval of length $\delta$ (and for an interval its length equals its Lebesgue
measure).

In general, $d$-dimensional Hausdorff measure in $\mathbb{R}^d$ is, up to a constant
factor, equal to Lebesgue measure.

Property 7 If $E$ is a Borel subset of $\mathbb{R}^d$, then $c_d m_d(E) = m(E)$ for
some constant $c_d$ that depends only on the dimension $d$.

The constant $c_d$ equals $m(B)/(\operatorname{diam} B)^d$, for the unit ball $B$; note that
this ratio is the same for all balls $B$ in $\mathbb{R}^d$, and so $c_d = v_d/2^d$ (where $v_d$
denotes the volume of the unit ball). The proof of this property relies on
the so-called iso-diametric inequality, which states that among all sets of
a given diameter, the ball has largest volume. (See Problem 2.) Without
using this geometric fact one can prove the following substitute.
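
The constant $c_d = v_d/2^d$ is easy to tabulate. The sketch below assumes the standard closed form $v_d = \pi^{d/2}/\Gamma(d/2 + 1)$ for the volume of the unit ball, which is not derived in this chapter; the function names are hypothetical.

```python
from math import gamma, pi


def unit_ball_volume(d):
    """Volume v_d of the unit ball in R^d (assumed standard formula)."""
    return pi ** (d / 2) / gamma(d / 2 + 1)


def ball_diameter_ratio(d):
    """c_d = v_d / 2^d, i.e. m(B) / (diam B)^d for the unit ball B."""
    return unit_ball_volume(d) / 2 ** d
```

In dimension one this gives $c_1 = 1$, consistent with Property 6, while $c_2 = \pi/4$ and $c_3 = \pi/6$.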

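Property 7' If $E$ is a Borel subset of $\mathbb{R}^d$ and $m(E)$ is its Lebesgue
measure, then $m_d(E) \approx m(E)$, in the sense that

$$c_d\, m_d(E) \le m(E) \le 2^d c_d\, m_d(E).$$

Using Exercise 26 in Chapter 3 we can find for every $\epsilon, \delta > 0$, a covering
of $E$ by balls $\{B_j\}$, such that $\operatorname{diam} B_j < \delta$, while $\sum_j m(B_j) \le m(E) + \epsilon$.
Now,

$$\mathcal{H}^\delta_d(E) \le \sum_j (\operatorname{diam} B_j)^d = c_d^{-1} \sum_j m(B_j) \le c_d^{-1}(m(E) + \epsilon).$$

Letting $\delta$ and $\epsilon$ tend to 0, we get $m_d(E) \le c_d^{-1} m(E)$. For the reverse
direction, let $E \subset \bigcup_j F_j$ be a covering with $\sum_j (\operatorname{diam} F_j)^d \le m_d(E) + \epsilon$.
We can always find closed balls $B_j$ centered at a point of $F_j$ so that
$B_j \supset F_j$ and $\operatorname{diam} B_j = 2 \operatorname{diam} F_j$. However, $m(E) \le \sum_j m(B_j)$, since
$E \subset \bigcup_j B_j$, and the last sum equals

$$\sum_j c_d (\operatorname{diam} B_j)^d = 2^d c_d \sum_j (\operatorname{diam} F_j)^d \le 2^d c_d (m_d(E) + \epsilon).$$

Letting $\epsilon \to 0$ gives $m(E) \le 2^d c_d\, m_d(E)$.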

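Property 8 If $m^*_\alpha(E) < \infty$ and $\beta > \alpha$, then $m^*_\beta(E) = 0$. Also, if $m^*_\alpha(E) > 0$
and $\beta < \alpha$, then $m^*_\beta(E) = \infty$.

Indeed, if $\operatorname{diam} F \le \delta$, and $\beta > \alpha$, then

$$(\operatorname{diam} F)^\beta = (\operatorname{diam} F)^{\beta - \alpha} (\operatorname{diam} F)^\alpha \le \delta^{\beta - \alpha} (\operatorname{diam} F)^\alpha.$$

Consequently

$$\mathcal{H}^\delta_\beta(E) \le \delta^{\beta - \alpha} \mathcal{H}^\delta_\alpha(E) \le \delta^{\beta - \alpha} m^*_\alpha(E).$$

Since $m^*_\alpha(E) < \infty$ and $\beta - \alpha > 0$, we find in the limit as $\delta$ tends to 0,
that $m^*_\beta(E) = 0$.

The contrapositive gives $m^*_\beta(E) = \infty$ whenever $m^*_\alpha(E) > 0$ and $\beta < \alpha$.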
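
The termwise inequality behind Property 8 can be checked on any finite list of diameters. This is a sketch for illustration only, with hypothetical function names; a single cover yields an upper bound for the infimum defining the exterior measure and does not compute it.

```python
def power_sum(diameters, exponent):
    """Sum of (diam F_k) ** exponent over a finite family of diameters."""
    return sum(d ** exponent for d in diameters)


def compare_power_sums(diameters, delta, alpha, beta):
    """Return (lhs, rhs) for the bound
        sum d^beta <= delta^(beta - alpha) * sum d^alpha,
    valid when beta > alpha and every diameter is at most delta.
    """
    if beta <= alpha:
        raise ValueError("need beta > alpha")
    if any(d > delta for d in diameters):
        raise ValueError("every diameter must be at most delta")
    lhs = power_sum(diameters, beta)
    rhs = delta ** (beta - alpha) * power_sum(diameters, alpha)
    return lhs, rhs
```

Since the factor $\delta^{\beta - \alpha}$ tends to 0 with $\delta$, a family of covers whose $\alpha$-sums stay bounded has $\beta$-sums tending to 0, which is the mechanism of the proof.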

We now make some easy observations that are consequences of the 
above properties. 

1. If $I$ is a finite line segment in $\mathbb{R}^d$, then $0 < m_1(I) < \infty$.

2. More generally, if $Q$ is a $k$-cube in $\mathbb{R}^d$ (that is, $Q$ is the product of
$k$ non-trivial intervals and $d - k$ points), then $0 < m_k(Q) < \infty$.

3. If $\mathcal{O}$ is a non-empty open set in $\mathbb{R}^d$, then $m_\alpha(\mathcal{O}) = \infty$ whenever
$\alpha < d$. Indeed, this follows because $m_d(\mathcal{O}) > 0$.

4. Note that we can always take $\alpha \le d$. This is because when $\alpha > d$,
$m_\alpha$ vanishes on every ball, and hence on all of $\mathbb{R}^d$.


2 Hausdorff dimension 


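Given a Borel subset $E$ of $\mathbb{R}^d$, we deduce from Property 8 that there
exists a unique $\alpha$ such that

$$m_\beta(E) = \begin{cases} \infty & \text{if } \beta < \alpha, \\ 0 & \text{if } \alpha < \beta. \end{cases}$$

In other words, $\alpha$ is given by

$$\alpha = \sup\{\beta : m_\beta(E) = \infty\} = \inf\{\beta : m_\beta(E) = 0\}.$$

We say that $E$ has Hausdorff dimension $\alpha$, or more succinctly, that
$E$ has dimension $\alpha$. We shall write $\alpha = \dim E$. At the critical value $\alpha$
we can say no more than that in general the quantity $m_\alpha(E)$ satisfies
$0 \le m_\alpha(E) \le \infty$. If $E$ is bounded and the inequalities are strict, that is,
$0 < m_\alpha(E) < \infty$, we say that $E$ has strict Hausdorff dimension $\alpha$.
The term fractal is commonly applied to sets of fractional dimension.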
The term fractal is commonly applied to sets of fractional dimension. 

In general, calculating the Hausdorff measure of a set is a difficult 
problem. However, it is possible in some cases to bound this measure 
from above and below, and hence determine the dimension of the set in 
question. A few examples will illustrate these new concepts.

2.1 Examples
The Cantor set 

The first striking example consists of the Cantor set $\mathcal{C}$, which was
constructed in Chapter 1 by successively removing the middle-third intervals
in $[0,1]$.

Theorem 2.1 The Cantor set $\mathcal{C}$ has strict Hausdorff dimension $\alpha =
\log 2/\log 3$.

The inequality

$$m_\alpha(\mathcal{C}) \le 1$$

follows from the construction of $\mathcal{C}$ and the definitions. Indeed, recall from
Chapter 1 that $\mathcal{C} = \bigcap C_k$, where each $C_k$ is a finite union of $2^k$ intervals of
length $3^{-k}$. Given $\delta > 0$, we first choose $K$ so large that $3^{-K} < \delta$. Since
the set $C_K$ covers $\mathcal{C}$ and consists of $2^K$ intervals of diameter $3^{-K} < \delta$,
we must have

$$\mathcal{H}^\delta_\alpha(\mathcal{C}) \le 2^K (3^{-K})^\alpha.$$

However, $\alpha$ satisfies precisely $3^\alpha = 2$, hence $2^K (3^{-K})^\alpha = 1$, and therefore
$m_\alpha(\mathcal{C}) \le 1$.
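
The covering used in this upper bound can be generated explicitly. The sketch below is illustrative, with hypothetical function names: it builds the $2^K$ intervals of the $K$-th stage with exact rational endpoints and evaluates the sum of their lengths raised to a chosen exponent in floating point.

```python
from fractions import Fraction
from math import log

ALPHA = log(2) / log(3)  # the exponent satisfying 3 ** ALPHA == 2


def cantor_intervals(level):
    """Closed intervals (a, b) making up the stage-`level` set C_level."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(level):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))   # keep the left third
            refined.append((b - third, b))   # keep the right third
        intervals = refined
    return intervals


def covering_sum(level, exponent=ALPHA):
    """Sum of (interval length) ** exponent over the stage-`level` cover."""
    return sum(float(b - a) ** exponent for a, b in cantor_intervals(level))
```

At the exponent $\log 2/\log 3$ the sum stays equal to 1 (up to rounding) at every stage; at exponent 1 it equals $(2/3)^K$ and tends to 0, and for smaller exponents it grows without bound. These sums only bound the relevant infimum from above, so the growth for small exponents does not by itself prove that the measure is infinite.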

The reverse inequality, which consists of proving that $0 < m_\alpha(\mathcal{C})$,
requires a further idea. Here we rely on the Cantor-Lebesgue function,
which maps $\mathcal{C}$ surjectively onto $[0,1]$. The key fact we shall use about
this function is that it satisfies a precise continuity condition that reflects
the dimension of the Cantor set.

A function $f$ defined on a subset $E$ of $\mathbb{R}^d$ satisfies a Lipschitz condition
on $E$ if there exists $M > 0$ such that

$$|f(x) - f(y)| \le M|x - y| \quad \text{for all } x, y \in E.$$

More generally, a function $f$ satisfies a Lipschitz condition with exponent
$\gamma$ (or is Hölder $\gamma$) if

$$|f(x) - f(y)| \le M|x - y|^\gamma \quad \text{for all } x, y \in E.$$

The only interesting case is when $0 < \gamma \le 1$. (See Exercise 3.)

Lemma 2.2 Suppose a function $f$ defined on a compact set $E$ satisfies
a Lipschitz condition with exponent $\gamma$. Then

(i) $m_\beta(f(E)) \le M^\beta m_\alpha(E)$ if $\beta = \alpha/\gamma$.

(ii) $\dim f(E) \le \frac{1}{\gamma} \dim E$.

Proof. Suppose $\{F_k\}$ is a countable family of sets that covers $E$.
Then $\{f(E \cap F_k)\}$ covers $f(E)$ and, moreover, $f(E \cap F_k)$ has diameter
less than $M(\operatorname{diam} F_k)^\gamma$. Hence

$$\sum_k (\operatorname{diam} f(E \cap F_k))^{\alpha/\gamma} \le M^{\alpha/\gamma} \sum_k (\operatorname{diam} F_k)^\alpha,$$

and part (i) follows. This result now immediately implies conclusion (ii).

Lemma 2.3 The Cantor-Lebesgue function $F$ on $\mathcal{C}$ satisfies a Lipschitz
condition with exponent $\gamma = \log 2/\log 3$.

Proof. The function $F$ was constructed in Section 3.1 of Chapter 3 as
the limit of a sequence $\{F_n\}$ of piecewise linear functions. The function
$F_n$ increases by at most $2^{-n}$ on each interval of length $3^{-n}$. So the slope
of $F_n$ is always bounded by $(3/2)^n$, and hence

$$|F_n(x) - F_n(y)| \le \left(\frac{3}{2}\right)^n |x - y|.$$

Moreover, the approximating sequence also satisfies $|F(x) - F_n(x)| \le
1/2^n$. These two estimates together with an application of the triangle
inequality give

$$|F(x) - F(y)| \le |F_n(x) - F_n(y)| + |F(x) - F_n(x)| + |F(y) - F_n(y)| \le \left(\frac{3}{2}\right)^n |x - y| + \frac{2}{2^n}.$$

Having fixed $x$ and $y$, we then minimize the right hand side by choosing
$n$ so that both terms have the same order of magnitude. This is achieved
by taking $n$ so that $3^n|x - y|$ is between 1 and 3. Then, we see that

$$|F(x) - F(y)| \le c\,2^{-n} = c\,(3^{-n})^\gamma \le M|x - y|^\gamma,$$

since $3^\gamma = 2$ and $3^{-n}$ is not greater than $|x - y|$. This argument is
repeated in Lemma 2.8 below.

With $E = \mathcal{C}$, $f$ the Cantor-Lebesgue function, and $\alpha = \gamma = \log 2/\log 3$,
the two lemmas give

$$m_1([0,1]) \le M^\beta m_\alpha(\mathcal{C}).$$

Thus $m_\alpha(\mathcal{C}) > 0$, and we find that $\dim \mathcal{C} = \log 2/\log 3$.

The proof of this example is typical in the sense that the inequality
$m_\alpha(\mathcal{C}) < \infty$ is usually easier to obtain than $0 < m_\alpha(\mathcal{C})$. Also, with
some extra effort, it is possible to show that the $\log 2/\log 3$-dimensional
Hausdorff measure of $\mathcal{C}$ is precisely 1. (See Exercise 7.)
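
The Lipschitz condition with exponent $\log 2/\log 3$ for the Cantor-Lebesgue function can be probed numerically. The sketch below is not the piecewise linear construction used in the proof: it assumes the equivalent description of the function through ternary digits (read digits until the first 1, replace each 2 by a 1, and interpret the result in base 2), and the function names are hypothetical. It samples only finitely many points, so it illustrates the estimate rather than proving it.

```python
from fractions import Fraction
from math import log

GAMMA = log(2) / log(3)


def cantor_function(x, depth=40):
    """Approximate the Cantor-Lebesgue function at x in [0, 1].

    Exact whenever the ternary expansion of x terminates or hits the
    digit 1 within `depth` digits; otherwise truncated from below.
    """
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    value, weight = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        if x == 0:
            break
        x *= 3
        digit = int(x)              # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:              # x lies in a removed middle third
            return value + weight
        value += weight * (digit // 2)
        weight /= 2
    return value


def max_holder_ratio(level=4):
    """Largest |F(x) - F(y)| / |x - y|**GAMMA over the grid k / 3**level."""
    n = 3 ** level
    points = [Fraction(k, n) for k in range(n + 1)]
    values = [cantor_function(p) for p in points]
    best = 0.0
    for i in range(n + 1):
        for j in range(i + 1, n + 1):
            ratio = float(values[j] - values[i]) / float(points[j] - points[i]) ** GAMMA
            best = max(best, ratio)
    return best
```

On such grids the ratio stays bounded as the level increases, in line with Lemma 2.3.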


Rectifiable curves 

A further example of the role of dimension comes from looking at continuous
curves in $\mathbb{R}^d$. Recall that a continuous curve $\gamma : [a, b] \to \mathbb{R}^d$ is
said to be simple if $\gamma(t_1) \ne \gamma(t_2)$ whenever $t_1 \ne t_2$, and quasi-simple
if the mapping $t \mapsto \gamma(t)$ is injective for $t$ in the complement of finitely
many points.

Theorem 2.4 Suppose the curve $\gamma$ is continuous and quasi-simple. Then
$\gamma$ is rectifiable if and only if $\Gamma = \{\gamma(t) : a \le t \le b\}$ has strict Hausdorff
dimension one. Moreover, in this case the length of the curve is precisely
its one-dimensional measure $m_1(\Gamma)$.

Proof. Suppose to begin with that $\Gamma$ is a rectifiable curve of length $L$,
and consider an arc-length parametrization $\tilde{\gamma}$ such that $\Gamma = \{\tilde{\gamma}(t) : 0 \le
t \le L\}$. This parametrization satisfies the Lipschitz condition

$$|\tilde{\gamma}(t_1) - \tilde{\gamma}(t_2)| \le |t_1 - t_2|.$$

This follows since $|t_1 - t_2|$ is the length of the curve between $t_1$ and $t_2$,
which is at least the distance from $\tilde{\gamma}(t_1)$ to $\tilde{\gamma}(t_2)$. Since $\tilde{\gamma}$ satisfies
the conditions of Lemma 2.2 with exponent 1 and $M = 1$, we find that

$$m_1(\Gamma) \le L.$$

To prove the reverse inequality, we let $a = t_0 < t_1 < \cdots < t_N = b$ denote
a partition of $[a, b]$ and let

$$\Gamma_j = \{\gamma(t) : t_j \le t \le t_{j+1}\},$$

so that $\Gamma = \bigcup_{j=0}^{N-1} \Gamma_j$, and hence

$$m_1(\Gamma) = \sum_{j=0}^{N-1} m_1(\Gamma_j)$$

by an application of Property 4 of the Hausdorff measure and the fact
that $\Gamma$ is quasi-simple. Indeed, by removing finitely many points the
union $\bigcup_{j=0}^{N-1} \Gamma_j$ becomes disjoint, while the points removed clearly have
zero $m_1$-measure. We next claim that $m_1(\Gamma_j) \ge \ell_j$, where $\ell_j$ is the distance
from $\gamma(t_j)$ to $\gamma(t_{j+1})$, that is, $\ell_j = |\gamma(t_{j+1}) - \gamma(t_j)|$. To see this,
recall that Hausdorff measure is rotation-invariant, and introduce new
orthogonal coordinates $x$ and $y$ such that $[\gamma(t_j), \gamma(t_{j+1})]$ is the segment
$[0, \ell_j]$ on the $x$-axis. The projection $\pi(x, y) = x$ satisfies the Lipschitz
condition

$$|\pi(P) - \pi(Q)| \le |P - Q|,$$

and clearly the segment $[0, \ell_j]$ on the $x$-axis is contained in the image
$\pi(\Gamma_j)$. Therefore, Lemma 2.2 guarantees

$$\ell_j \le m_1(\Gamma_j),$$

and thus $m_1(\Gamma) \ge \sum \ell_j$. Since by definition the length $L$ of $\Gamma$ is the
supremum of the sums $\sum \ell_j$ over all partitions of $[a, b]$, we find that
$m_1(\Gamma) \ge L$, as desired.

Conversely, if $\Gamma$ has strict Hausdorff dimension 1, then $m_1(\Gamma) < \infty$,
and the above argument shows that $\Gamma$ is rectifiable.
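
The chord sums $\sum \ell_j$ entering this proof can be computed for a concrete curve. The sketch below is an illustration under stated assumptions: it uses uniform partitions and a planar curve, the quarter circle is an example chosen here, and the function names are hypothetical.

```python
from math import cos, hypot, pi, sin


def polygonal_length(curve, a, b, pieces):
    """Sum of chord lengths |curve(t_{j+1}) - curve(t_j)| over a uniform
    partition of [a, b] into `pieces` subintervals (planar curves only)."""
    total = 0.0
    previous = curve(a)
    for j in range(1, pieces + 1):
        t = a + (b - a) * j / pieces
        point = curve(t)
        total += hypot(point[0] - previous[0], point[1] - previous[1])
        previous = point
    return total


def quarter_circle(t):
    """Arc of the unit circle, traversed for t in [0, pi / 2]."""
    return (cos(t), sin(t))
```

For the quarter circle the sums increase under refinement of the partition and stay below the length $\pi/2$, approaching it as the partition becomes finer.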

The reader may note the resemblance of this characterization of
rectifiability and an earlier one in terms of Minkowski content, given in
Chapter 3. In this connection we point out that there is a different
notion of dimension that is sometimes used instead of Hausdorff dimension.
For a compact set $E$, this dimension is given in terms of the size
of $E^\delta = \{x \in \mathbb{R}^d : d(x, E) < \delta\}$ as $\delta \to 0$. One observes that if $E$ is a
$k$-dimensional cube in $\mathbb{R}^d$, then $m(E^\delta) \le c\,\delta^{d-k}$ as $\delta \to 0$, with $m$ the
Lebesgue measure of $\mathbb{R}^d$. With this in mind, the Minkowski dimension
of $E$ is defined by

$$\inf\{\beta : m(E^\delta) = O(\delta^{d - \beta}) \text{ as } \delta \to 0\}.$$

One can show that the Hausdorff dimension of a set does not exceed its
Minkowski dimension, but that equality does not hold in general. More
details may be found in Exercises 17 and 18.

The Sierpinski triangle 

A Cantor-like set can be constructed in the plane as follows. We begin
with a (solid) closed equilateral triangle $S_0$, whose sides have unit length.
Then, as a first step we remove the shaded open equilateral triangle
pictured in Figure 1.

Figure 1. Construction of the Sierpinski triangle

This leaves three closed triangles whose union we denote by $S_1$. Each
triangle is half the size of the original (or parent) triangle $S_0$, and these
smaller closed triangles are said to be of the first generation: the
triangles in $S_1$ are the children of the parent $S_0$. In the second step, we repeat
the process in each triangle of the first generation. Each such triangle
has three children of the second generation. We denote by $S_2$ the union
of the nine triangles in the second generation. We then repeat this
process to find a sequence $S_k$ of compact sets which satisfy the following
properties:

(a) Each $S_k$ is a union of $3^k$ closed equilateral triangles of side length
$2^{-k}$. (These are the triangles of the $k$th generation.)

(b) $\{S_k\}$ is a decreasing sequence of compact sets; that is, $S_{k+1} \subset S_k$
for all $k \ge 0$.

The Sierpinski triangle is the compact set defined by

$$\mathcal{S} = \bigcap_{k=0}^{\infty} S_k.$$

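Theorem 2.5 The Sierpinski triangle $\mathcal{S}$ has strict Hausdorff dimension
$\alpha = \log 3/\log 2$.

The inequality $m_\alpha(\mathcal{S}) \le 1$ follows immediately from the construction.
Given $\delta > 0$, choose $K$ so that $2^{-K} < \delta$. Since the set $S_K$ covers $\mathcal{S}$ and
consists of $3^K$ triangles each of diameter $2^{-K} < \delta$, we must have

$$\mathcal{H}_\alpha^\delta(\mathcal{S}) \le 3^K (2^{-K})^\alpha.$$

But since $2^\alpha = 3$, we find $\mathcal{H}_\alpha^\delta(\mathcal{S}) \le 1$, hence $m_\alpha(\mathcal{S}) \le 1$.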
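
The covering estimate above can be checked numerically. The following is a minimal sketch, not part of the original argument; the names `ALPHA` and `cover_sum` are introduced here for illustration only. It evaluates the sum $3^K (2^{-K})^s$ attached to the cover $S_K$ for the critical exponent and for exponents slightly above and below it.

```python
import math

ALPHA = math.log(3) / math.log(2)  # the exponent alpha of Theorem 2.5

def cover_sum(K, s):
    """Sum of (diameter)^s over the 3^K triangles of side 2^-K that make up S_K."""
    return 3**K * (2.0**-K) ** s

# At s = ALPHA the sum stays equal to 1; above ALPHA it decays, below it grows.
for K in (1, 5, 10, 20):
    print(K, cover_sum(K, ALPHA), cover_sum(K, ALPHA + 0.1), cover_sum(K, ALPHA - 0.1))
```

The behaviour at exponents other than $\alpha$ only reflects this particular family of covers; the lower bound $m_\alpha(\mathcal{S}) > 0$ still requires an argument valid for arbitrary covers.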

The inequality $m_\alpha(\mathcal{S}) > 0$ is more subtle. For its proof we need to fix
a special point in each triangle that appears in the construction of $\mathcal{S}$.
We choose to call the lower left vertex of a triangle the vertex of that
triangle. With this choice there are $3^k$ vertices of the $k$th generation.
The argument that follows is based on the important fact that all these
vertices belong to $\mathcal{S}$.

Suppose $\mathcal{S} \subset \bigcup_{j=1}^{\infty} F_j$, with $\operatorname{diam} F_j < \delta$. We wish to prove that

$$\sum_j (\operatorname{diam} F_j)^\alpha \ge c > 0$$

for some constant $c$. Clearly, each $F_j$ is contained in a ball of twice the
diameter of $F_j$, so upon replacing $2\delta$ by $\delta$ and noting that $\mathcal{S}$ is compact,
it suffices to show that if $\mathcal{S} \subset \bigcup_{j=1}^{N} B_j$, where $\mathcal{B} = \{B_j\}_{j=1}^{N}$ is a finite
collection of balls whose diameters are less than $\delta$, then

$$\sum_{j=1}^{N} (\operatorname{diam} B_j)^\alpha \ge c > 0.$$

Suppose we have such a covering by balls. Consider the minimum
diameter of the $B_j$, and choose $k$ so that

$$2^{-k} \le \min_{1 \le j \le N} \operatorname{diam} B_j < 2^{-k+1}.$$

Lemma 2.6 Suppose $B$ is a ball in the covering $\mathcal{B}$ that satisfies

$$2^{-\ell} \le \operatorname{diam} B < 2^{-\ell+1} \quad \text{for some } \ell \le k.$$

Then $B$ contains at most $c3^{k-\ell}$ vertices of the $k$th generation.

In this chapter, we shall continue to use the common practice of denoting
by $c, c', \ldots$ generic constants whose values are unimportant and may
change from one usage to another. We also use $A \approx B$ to denote that
the quantities $A$ and $B$ are comparable, that is, $cB \le A \le c'B$ for
appropriate constants $c$ and $c'$.

Proof of Lemma 2.6. Let $B^*$ denote the ball with the same center as $B$ but
three times its diameter, and let $\triangle_k$ be a triangle of the $k$th generation
whose vertex $v$ lies in $B$. If $\triangle'_\ell$ denotes the triangle of the $\ell$th generation
that contains $\triangle_k$, then since $\operatorname{diam} B \ge 2^{-\ell}$,

$$v \in \triangle_k \subset \triangle'_\ell \subset B^*,$$

as shown in Figure 2.

Next, there is a positive constant $c$ such that $B^*$ can contain at most
$c$ distinct triangles of the $\ell$th generation. This is because triangles of the
$\ell$th generation have disjoint interiors and area equal to $c'4^{-\ell}$, while $B^*$
has area at most equal to $c''4^{-\ell}$. Finally, each $\triangle'_\ell$ contains $3^{k-\ell}$ triangles
of the $k$th generation, hence $B$ can contain at most $c3^{k-\ell}$ vertices of
triangles of the $k$th generation.

To complete the proof that $\sum_{j=1}^{N} (\operatorname{diam} B_j)^\alpha \ge c > 0$, note that

$$\sum_{j=1}^{N} (\operatorname{diam} B_j)^\alpha \ge \sum_{\ell} N_\ell\, 2^{-\ell\alpha},$$

where $N_\ell$ denotes the number of balls in $\mathcal{B}$ that satisfy $2^{-\ell} \le \operatorname{diam} B_j <
2^{-\ell+1}$. By the lemma, we see that the total number of vertices of triangles
in the $k$th generation that can be covered by the collection $\mathcal{B}$ can be no
more than $c \sum_\ell N_\ell 3^{k-\ell}$. Since all $3^k$ vertices of triangles in the $k$th
generation belong to $\mathcal{S}$, and all vertices of the $k$th generation must be
covered, we must have $c \sum_\ell N_\ell 3^{k-\ell} \ge 3^k$. Hence

$$\sum_{\ell} N_\ell\, 3^{-\ell} \ge c.$$

It now suffices to recall the definition of $\alpha$, which guarantees $2^{-\ell\alpha} = 3^{-\ell}$,
and therefore

$$\sum_{j=1}^{N} (\operatorname{diam} B_j)^\alpha \ge c,$$

as desired.

We give a final example that exhibits properties similar to the Cantor 
set and Sierpinski triangle. It is the curve discovered by von Koch in 1904. 


The von Koch curve 

Consider the unit interval $K_0 = [0,1]$, which we may think of as lying
on the x-axis in the xy-plane. Then consider the polygonal path $K_1$
illustrated in Figure 3, which consists of four equal line segments of
length 1/3.





Figure 3. The first few stages in the construction of the von Koch curve 


Let $K_1(t)$, for $0 \le t \le 1$, denote the parametrization of $K_1$ that has
constant speed. In other words, as $t$ travels from 0 to 1/4, the point
$K_1(t)$ travels on the first line segment. As $t$ travels from 1/4 to 1/2, the
point $K_1(t)$ travels on the second line segment, and so on. In particular,
we see that $K_1(\ell/4)$ for $0 \le \ell \le 4$ correspond to the five vertices of $K_1$.

At the second stage of the construction we repeat the process of
replacing each line segment in stage one by the corresponding polygonal
line. We then obtain the polygonal curve $K_2$ illustrated in Figure 3. It
has $16 = 4^2$ segments of length $1/9 = 3^{-2}$. We choose a parametrization
$K_2(t)$ ($0 \le t \le 1$) of $K_2$ that has constant speed. Observe that $K_2(\ell/4^2)$
for $0 \le \ell \le 4^2$ gives all vertices of $K_2$, and that the vertices of $K_1$ belong
to $K_2$, with

$$K_2(\ell/4) = K_1(\ell/4) \quad \text{for } 0 \le \ell \le 4.$$

Repeating this process indefinitely, we obtain a sequence of continuous
polygonal curves $\{K_j\}$, where $K_j$ consists of $4^j$ segments of length $3^{-j}$
each. If $K_j(t)$ ($0 \le t \le 1$) is the parametrization of $K_j$ that has constant
speed, then the vertices are precisely at the points $K_j(\ell/4^j)$, and

$$K_{j'}(\ell/4^j) = K_j(\ell/4^j) \quad \text{for } 0 \le \ell \le 4^j$$

whenever $j' \ge j$.
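
The polygonal stages can be generated directly, which makes the counting and nesting statements above easy to test. The sketch below is illustrative only: it identifies the plane with the complex numbers, and the function name `koch_vertices` is an assumption made here, not notation from the text.

```python
import cmath

def koch_vertices(j):
    """Vertices K_j(l / 4^j), l = 0, ..., 4^j, of the j-th polygonal stage, as complex numbers."""
    pts = [0j, 1 + 0j]                    # K_0 is the unit interval
    rot = cmath.exp(1j * cmath.pi / 3)    # rotation by pi/3 about the origin
    for _ in range(j):
        new = [pts[0]]
        for a, b in zip(pts, pts[1:]):
            d = (b - a) / 3               # one third of the current segment
            # replace the segment [a, b] by four segments of one third the length
            new += [a + d, a + d + d * rot, a + 2 * d, b]
        pts = new
    return pts

v = koch_vertices(3)
print(len(v) - 1, abs(v[1] - v[0]))       # number of segments and their common length
```

Because each segment keeps its two endpoints when it is subdivided, the vertex with index $\ell$ at stage $j$ reappears with index $4\ell$ at stage $j+1$, which is the nesting property stated above.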

In the limit as $j$ tends to infinity, the polygonal lines $K_j$ tend to the
von Koch curve $\mathcal{K}$. Indeed, we have

$$|K_{j+1}(t) - K_j(t)| \le 3^{-j} \quad \text{for all } 0 \le t \le 1 \text{ and } j \ge 0.$$

This is clear when $j = 0$, and follows by induction in $j$ when we consider
the nature of the construction of the $j$th stage. Since we may write

$$K_J(t) = K_1(t) + \sum_{j=1}^{J-1} \bigl(K_{j+1}(t) - K_j(t)\bigr),$$

the above estimate proves that the series

$$K_1(t) + \sum_{j=1}^{\infty} \bigl(K_{j+1}(t) - K_j(t)\bigr)$$

converges absolutely and uniformly to a continuous function $\mathcal{K}(t)$ that is
a parametrization of $\mathcal{K}$. Besides continuity, the function $\mathcal{K}(t)$ satisfies a
regularity assumption that takes the form of a Lipschitz condition, as in
the case of the Cantor-Lebesgue function.

Theorem 2.7 The function $\mathcal{K}(t)$ satisfies a Lipschitz condition of
exponent $\gamma = \log 3/\log 4$, that is:

$$|\mathcal{K}(t) - \mathcal{K}(s)| \le M|t - s|^\gamma \quad \text{for all } t, s \in [0,1].$$

We have already observed that $|K_{j+1}(t) - K_j(t)| \le 3^{-j}$. Since $K_j$ travels
a distance of $3^{-j}$ in $4^{-j}$ units of time, we see that

$$|K_j'(t)| \le \left(\frac{4}{3}\right)^j \quad \text{except when } t = \ell/4^j.$$

Consequently we must have

$$|K_j(t) - K_j(s)| \le \left(\frac{4}{3}\right)^j |t - s|.$$

Moreover, $\mathcal{K}(t) = K_1(t) + \sum_{j=1}^{\infty} (K_{j+1}(t) - K_j(t))$. We now find
ourselves in precisely the same situation as in the proof that the
Cantor-Lebesgue function satisfies a Lipschitz condition with exponent $\log 2/\log 3$.
We generalize that argument in the following lemma.

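Lemma 2.8 Suppose $\{f_j\}$ is a sequence of continuous functions on the
interval $[0,1]$ that satisfy

$$|f_j(t) - f_j(s)| \le A^j |t - s| \quad \text{for some } A > 1,$$

and

$$|f_j(t) - f_{j+1}(t)| \le B^{-j} \quad \text{for some } B > 1.$$

Then the limit $f(t) = \lim_{j \to \infty} f_j(t)$ exists and satisfies

$$|f(t) - f(s)| \le M|t - s|^\gamma,$$

where $\gamma = \log B/\log(AB)$.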

Proof. The continuous limit $f$ is given by the uniformly convergent
series

$$f(t) = f_1(t) + \sum_{k=1}^{\infty} \bigl(f_{k+1}(t) - f_k(t)\bigr),$$

and therefore

$$|f(t) - f_j(t)| \le \sum_{k=j}^{\infty} |f_{k+1}(t) - f_k(t)| \le \sum_{k=j}^{\infty} B^{-k} \le cB^{-j}.$$

The triangle inequality, an application of the inequality just obtained,
and the inequality in the statement of the lemma give

$$|f(t) - f(s)| \le |f_j(t) - f_j(s)| + |(f - f_j)(t)| + |(f - f_j)(s)|
\le c\bigl(A^j|t - s| + B^{-j}\bigr).$$

For a fixed pair of numbers $t$ and $s$ with $t \ne s$, we choose $j$ to minimize
the sum $A^j|t - s| + B^{-j}$. This is essentially achieved by picking $j$ so that
the two terms $A^j|t - s|$ and $B^{-j}$ are comparable. More precisely, we choose
a $j$ that satisfies

$$(AB)^j |t - s| \le 1 \quad \text{and} \quad 1 \le (AB)^{j+1} |t - s|.$$

Since $|t - s| \le 1$ and $AB > 1$, such a $j$ must exist. The first inequality
then gives

$$A^j |t - s| \le B^{-j},$$

while raising the second inequality to the power $\gamma$, and using the fact
that $(AB)^\gamma = B$, gives

$$1 \le B^{j+1} |t - s|^\gamma.$$

Thus $B^{-j} \le B|t - s|^\gamma$, and consequently, absorbing the factor $B$ into the
constant,

$$|f(t) - f(s)| \le c\bigl(A^j|t - s| + B^{-j}\bigr) \le M|t - s|^\gamma,$$

as was to be shown.
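
The choice of $j$ in this proof can be made concrete. The sketch below is an illustration under stated assumptions, not part of the original argument: the helper names `holder_exponent` and `balancing_index` are introduced here, and $h$ stands for $|t - s|$ with $0 < h \le 1$. It computes the exponent $\gamma$ and an index $j$ with $(AB)^j h \le 1 \le (AB)^{j+1} h$, then compares $A^j h + B^{-j}$ with the multiple $2B\,h^\gamma$ that follows from $A^j h \le B^{-j} \le B h^\gamma$.

```python
import math

def holder_exponent(A, B):
    """The exponent gamma = log B / log(AB) of Lemma 2.8."""
    return math.log(B) / math.log(A * B)

def balancing_index(A, B, h):
    """An integer j >= 0 with (AB)^j * h <= 1 <= (AB)^(j+1) * h, for 0 < h <= 1."""
    return math.floor(-math.log(h) / math.log(A * B))

# Values for the von Koch curve: A = 4/3 and B = 3, so gamma = log 3 / log 4.
A, B = 4 / 3, 3.0
gamma = holder_exponent(A, B)
for h in (0.5, 0.1, 1e-3):
    j = balancing_index(A, B, h)
    print(h, j, A**j * h + B**-j, 2 * B * h**gamma)
```

Floating-point rounding can shift `balancing_index` by one when $h$ is an exact power of $1/(AB)$; this does not affect the comparison of the two sides up to a constant.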


In particular, this result with Lemma 2.2 implies that

$$\dim \mathcal{K} \le \frac{1}{\gamma} = \frac{\log 4}{\log 3}.$$

To prove that $m_{1/\gamma}(\mathcal{K}) > 0$ and hence $\dim \mathcal{K} = \log 4/\log 3$ requires an
argument similar to the one given for the Sierpinski triangle. In fact,
this argument generalizes to cover a general family of sets that have a
self-similarity property. We therefore turn our attention to this general
theory next.

Remarks. We mention some further facts about the von Koch curve. 
More details can be found in Exercises 13, 14, and 15 below. 


1. The curve $\mathcal{K}$ is one in a family of similarly constructed curves. For
each $\ell$, $1/4 < \ell < 1/2$, consider at the first stage the curve $K_1^\ell(t)$
given by four line segments each of length $\ell$, the first and last on the
x-axis, and the second and third forming two sides of an isosceles
triangle whose base lies on the x-axis. (See Figure 4.) The case
$\ell = 1/3$ corresponds to the previously defined von Koch curve.

Proceeding as in the case $\ell = 1/3$, one obtains a curve $\mathcal{K}^\ell$, and it
can be seen that

$$\dim(\mathcal{K}^\ell) = \frac{\log 4}{\log 1/\ell}.$$

Thus for every $\alpha$, $1 < \alpha < 2$, we have a curve of this kind of
dimension $\alpha$. Note that when $\ell \to 1/4$ the limiting curve is a straight line
segment, which has dimension 1. When $\ell \to 1/2$, the limit can be
seen to correspond to a “space-filling” curve.

2. The curves $t \mapsto \mathcal{K}^\ell(t)$, $1/4 < \ell < 1/2$, are each nowhere
differentiable. One can also show that each curve is simple when
$1/4 < \ell < 1/2$.

2.2 Self-similarity 

The Cantor set $\mathcal{C}$, the Sierpinski triangle $\mathcal{S}$, and von Koch curve $\mathcal{K}$ all
share an important property: each of these sets contains scaled copies
of itself. Moreover, each of these examples was constructed by iterating
a process closely tied to its scaling. For instance, the interval $[0,1/3]$
contains a copy of the Cantor set scaled by a factor of 1/3. The same is
true for the interval $[2/3,1]$, and therefore

$$\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2,$$

where $\mathcal{C}_1$ and $\mathcal{C}_2$ are scaled versions of $\mathcal{C}$. Also, each interval $[0,1/9]$,
$[2/9,3/9]$, $[6/9,7/9]$ and $[8/9,1]$ contains a copy of $\mathcal{C}$ scaled by a factor
of 1/9, and so on.

In the case of the Sierpinski triangle, each of the three triangles in the
first generation contains a copy of $\mathcal{S}$ scaled by the factor of 1/2. Hence

$$\mathcal{S} = \mathcal{S}_1 \cup \mathcal{S}_2 \cup \mathcal{S}_3,$$

where each $\mathcal{S}_j$, $j = 1,2,3$, is obtained by scaling and translating the
original Sierpinski triangle. More generally, every triangle in the $k$th
generation is a copy of $\mathcal{S}$ scaled by the factor of $1/2^k$.

Finally, each line segment in the initial stage of the construction of the
von Koch curve gives rise to a scaled and possibly rotated copy of the
von Koch curve. In fact

$$\mathcal{K} = \mathcal{K}_1 \cup \mathcal{K}_2 \cup \mathcal{K}_3 \cup \mathcal{K}_4,$$

where $\mathcal{K}_j$, $j = 1,2,3,4$, is obtained by scaling $\mathcal{K}$ by the factor of 1/3 and
translating and rotating it.

Thus these examples each contain replicas of themselves, but on a 
smaller scale. In this section, we give a precise definition of the resulting 
notion of self-similarity and prove a theorem determining the Hausdorff 
dimension of these sets. 

A mapping $S : \mathbb{R}^d \to \mathbb{R}^d$ is said to be a similarity with ratio $r > 0$ if

$$|S(x) - S(y)| = r|x - y|.$$

It can be shown that every similarity of $\mathbb{R}^d$ is the composition of a
translation, a rotation, and a dilation by $r$. (See Problem 3.)

Given finitely many similarities $S_1, S_2, \ldots, S_m$ with the same ratio $r$, we
say that the set $F \subset \mathbb{R}^d$ is self-similar if

$$F = S_1(F) \cup \cdots \cup S_m(F).$$

We point out the relevance of the various examples we have already seen.
When $F = \mathcal{C}$ is the Cantor set, there are two similarities given by

$$S_1(x) = x/3 \quad \text{and} \quad S_2(x) = x/3 + 2/3$$

of ratio 1/3. So $m = 2$ and $r = 1/3$.

In the case of $F = \mathcal{S}$, the Sierpinski triangle, the ratio is $r = 1/2$ and
there are $m = 3$ similarities given by

$$S_1(x) = \frac{x}{2}, \quad S_2(x) = \frac{x}{2} + \alpha \quad \text{and} \quad S_3(x) = \frac{x}{2} + \beta.$$

Here, $\alpha$ and $\beta$ are the points drawn in the first diagram in Figure 5.

If $F = \mathcal{K}$, the von Koch curve, we have

$$S_1(x) = \frac{x}{3}, \quad S_2(x) = \rho\,\frac{x}{3} + \alpha, \quad S_3(x) = \rho^{-1}\frac{x}{3} + \beta,$$

and

$$S_4(x) = \frac{x}{3} + \gamma,$$

where $\rho$ is the rotation centered at the origin and of angle $\pi/3$. There
are $m = 4$ similarities which have ratio $r = 1/3$. The points $\alpha$, $\beta$, and $\gamma$
are shown in the second diagram in Figure 5.

Figure 5. Similarities of the Sierpinski triangle and von Koch curve

Another example, sometimes called the Cantor dust $\mathcal{D}$, is another
two-dimensional version of the standard Cantor set. For each fixed $0 <
\mu < 1/2$, the set $\mathcal{D}$ may be constructed by starting with the unit square
$Q = [0,1] \times [0,1]$. At the first stage we remove everything but the four
closed squares in the corners of $Q$ that have side length $\mu$. This yields a
union $D_1$ of four squares, as illustrated in Figure 6.

Figure 6. Construction of the Cantor dust

We repeat this process in each sub-square of $D_1$; that is, we remove
everything but the four squares in the corner, each of side length $\mu^2$.
This gives a union $D_2$ of 16 squares. Repeating this process, we obtain
a family $D_1 \supset D_2 \supset \cdots \supset D_k \supset \cdots$ of compact sets whose intersection
defines the Cantor dust corresponding to the parameter $\mu$.

There are here $m = 4$ similarities of ratio $\mu$ given by

$$S_1(x) = \mu x,$$
$$S_2(x) = \mu x + (0, 1 - \mu),$$
$$S_3(x) = \mu x + (1 - \mu, 1 - \mu),$$
$$S_4(x) = \mu x + (1 - \mu, 0).$$

It is to be noted that $\mathcal{D}$ is the product $\mathcal{C}_\xi \times \mathcal{C}_\xi$, with $\mathcal{C}_\xi$ the Cantor set
of constant dissection $\xi$, as defined in Exercise 3 of Chapter 1. Here
$\xi = 1 - 2\mu$.

The first result we prove guarantees the existence of self-similar sets 
under the assumption that the similarities are contracting, that is, that 
their ratio satisfies r < 1. 

Theorem 2.9 Suppose $S_1, S_2, \ldots, S_m$ are $m$ similarities, each with the
same ratio $r$ that satisfies $0 < r < 1$. Then there exists a unique
non-empty compact set $F$ such that

$$F = S_1(F) \cup \cdots \cup S_m(F).$$

The proof of this theorem is in the nature of a fixed point argument.
We shall begin with some large ball $B$ and iteratively apply the mappings
$S_1, \ldots, S_m$. The fact that the similarities have ratio $r < 1$ will suffice to
imply that this process contracts to a unique set $F$ with the desired
property.

Lemma 2.10 There exists a closed ball $B$ so that $S_j(B) \subset B$ for all
$j = 1, \ldots, m$.

Proof. Indeed, we note that if $S$ is a similarity with ratio $r$, then

$$|S(x)| \le |S(x) - S(0)| + |S(0)| \le r|x| + |S(0)|.$$

If we require that $|x| \le R$ implies $|S(x)| \le R$, it suffices to choose $R$
so that $rR + |S(0)| \le R$, that is, $R \ge |S(0)|/(1 - r)$. In this fashion,
we obtain for each $S_j$ a ball $B_j$ centered at the origin that satisfies
$S_j(B_j) \subset B_j$. If $B$ denotes the ball among the $B_j$ with the largest radius,
then the above shows that $S_j(B) \subset B$ for all $j$.
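
For the two Cantor similarities $x \mapsto x/3$ and $x \mapsto x/3 + 2/3$ the radius in this proof can be computed explicitly, and the images of the ball under repeated application of the maps can be listed. The sketch below is an illustration only; the names `invariant_radius` and `apply_maps` are hypothetical, and exact rational arithmetic is used so that the inclusions are checked without rounding.

```python
from fractions import Fraction

RATIO = Fraction(1, 3)
OFFSETS = [Fraction(0), Fraction(2, 3)]   # the values S_j(0) for the two Cantor maps

def invariant_radius(ratio, offsets):
    """Smallest R with ratio * R + |S_j(0)| <= R for every j, as in the proof of Lemma 2.10."""
    return max(abs(b) for b in offsets) / (1 - ratio)

def apply_maps(intervals):
    """Images of a list of intervals (lo, hi) under both maps, returned as a sorted list."""
    return sorted((RATIO * lo + b, RATIO * hi + b) for lo, hi in intervals for b in OFFSETS)

R = invariant_radius(RATIO, OFFSETS)      # equals 1 for the Cantor maps
stage = [(-R, R)]                         # the ball B = [-R, R] on the line
for k in range(1, 4):
    stage = apply_maps(stage)
    print(k, [(str(lo), str(hi)) for lo, hi in stage])
```

Each printed stage consists of $2^k$ intervals of length $2R \cdot 3^{-k}$, all contained in the intervals of the previous stage.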

Now for any set $A$, let $\tilde{S}(A)$ denote the set given by

$$\tilde{S}(A) = S_1(A) \cup \cdots \cup S_m(A).$$

Note that if $A \subset A'$ then $\tilde{S}(A) \subset \tilde{S}(A')$.

Also observe that while each $S_j$ is a mapping from $\mathbb{R}^d$ to $\mathbb{R}^d$, the
mapping $\tilde{S}$ is not a point mapping, but takes subsets of $\mathbb{R}^d$ to subsets of
$\mathbb{R}^d$.

To exploit the notion of contraction with a ratio less than 1, we
introduce the distance between two compact sets as follows. For each $\delta > 0$
and set $A$, we let

$$A^\delta = \{x : d(x, A) < \delta\}.$$

Hence $A^\delta$ is a set that contains $A$ but which is slightly larger in terms of $\delta$.
If $A$ and $B$ are two compact sets, we define the Hausdorff distance as

$$\operatorname{dist}(A, B) = \inf\{\delta : B \subset A^\delta \text{ and } A \subset B^\delta\}.$$

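Lemma 2.11 The distance function dist defined on compact subsets of
$\mathbb{R}^d$ satisfies

(i) $\operatorname{dist}(A, B) = 0$ if and only if $A = B$.

(ii) $\operatorname{dist}(A, B) = \operatorname{dist}(B, A)$.

(iii) $\operatorname{dist}(A, B) \le \operatorname{dist}(A, C) + \operatorname{dist}(C, B)$.

If $S_1, \ldots, S_m$ are similarities with ratio $r$, then

(iv) $\operatorname{dist}(\tilde{S}(A), \tilde{S}(B)) \le r \operatorname{dist}(A, B)$.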

The proof of the lemma is simple and may be left to the reader. 
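
For finite sets of real numbers the Hausdorff distance reduces to a maximum of nearest-point distances, so property (iv) can be observed directly. The sketch below is illustrative and restricted to finite subsets of the line; the names `hausdorff_dist` and `cantor_step` are assumptions made here, and the maps are the two Cantor similarities of ratio 1/3.

```python
from fractions import Fraction

def hausdorff_dist(A, B):
    """Hausdorff distance between two finite non-empty sets of real numbers."""
    d_ab = max(min(abs(a - b) for b in B) for a in A)   # how far A sticks out of B
    d_ba = max(min(abs(a - b) for a in A) for b in B)   # how far B sticks out of A
    return max(d_ab, d_ba)

def cantor_step(A):
    """The union S_1(A) U S_2(A) for S_1(x) = x/3 and S_2(x) = x/3 + 2/3."""
    return {x / 3 for x in A} | {x / 3 + Fraction(2, 3) for x in A}

A = {Fraction(0)}
B = {Fraction(5), Fraction(-2)}
for k in range(5):
    print(k, hausdorff_dist(A, B))        # shrinks by at least the factor 1/3 per step
    A, B = cantor_step(A), cantor_step(B)
```

Starting from two unrelated sets, the iterates approach each other at the rate $r^k$, which is the mechanism behind the uniqueness statement of Theorem 2.9.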

Using both lemmas we may now prove Theorem 2.9. We first choose
$B$ as in Lemma 2.10, and let $F_k = \tilde{S}^k(B)$, where $\tilde{S}^k$ denotes the $k$th
composition of $\tilde{S}$, that is, $\tilde{S}^k = \tilde{S}^{k-1} \circ \tilde{S}$ with $\tilde{S}^1 = \tilde{S}$. Each $F_k$ is compact,
non-empty, and $F_k \subset F_{k-1}$, since $\tilde{S}(B) \subset B$. If we let

$$F = \bigcap_{k=1}^{\infty} F_k,$$

then $F$ is compact, non-empty, and clearly $\tilde{S}(F) = F$, since applying $\tilde{S}$
to $\bigcap_{k=1}^{\infty} F_k$ yields $\bigcap_{k=2}^{\infty} F_k$, which also equals $F$.

Uniqueness of the set $F$ is proved as follows. Suppose $G$ is another
compact set so that $\tilde{S}(G) = G$. Then, an application of part (iv) in
Lemma 2.11 yields $\operatorname{dist}(F, G) \le r \operatorname{dist}(F, G)$. Since $r < 1$, this forces
$\operatorname{dist}(F, G) = 0$, so that $F = G$, and the proof of Theorem 2.9 is
complete.

Under an additional technical condition, one can calculate the precise
Hausdorff dimension of the self-similar set $F$. Loosely speaking, the
restriction holds if the sets $S_1(F), \ldots, S_m(F)$ do not overlap too much.
Indeed, if these sets were disjoint, then we could argue that

$$m_\alpha(F) = \sum_{j=1}^{m} m_\alpha(S_j(F)).$$

Since each $S_j$ scales by $r$, we would then have $m_\alpha(S_j(F)) = r^\alpha m_\alpha(F)$.
Hence

$$m_\alpha(F) = m r^\alpha m_\alpha(F).$$

If $m_\alpha(F)$ were finite and positive, then we would have that $mr^\alpha = 1$; thus

$$\alpha = \frac{\log m}{\log 1/r}.$$

The restriction we impose is as follows. We say that the similarities
$S_1, \ldots, S_m$ are separated if there is a bounded open set $\mathcal{O}$ so that

$$\mathcal{O} \supset S_1(\mathcal{O}) \cup \cdots \cup S_m(\mathcal{O}),$$

and the $S_j(\mathcal{O})$ are disjoint. It is not assumed that $\mathcal{O}$ contains $F$.

Theorem 2.12 Suppose $S_1, S_2, \ldots, S_m$ are $m$ separated similarities with
the common ratio $r$ that satisfies $0 < r < 1$. Then the set $F$ has
Hausdorff dimension equal to $\log m/\log(1/r)$.

Observe first that when $F$ is the Cantor set we may take $\mathcal{O}$ to be
the open unit interval, and note that we have already proved that its
dimension is $\log 2/\log 3$. For the Sierpinski triangle the open unit triangle
will do, and $\dim \mathcal{S} = \log 3/\log 2$. In the example of the Cantor dust the
open unit square works, and $\dim \mathcal{D} = \log m/\log \mu^{-1}$ (with $m = 4$). Finally, for the von
Koch curve we may take the interior of the triangle pictured in Figure 7,
and we will have $\dim \mathcal{K} = \log 4/\log 3$.
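
The values quoted above all come from the single formula $\log m/\log(1/r)$. As a small illustration (the function name `similarity_dimension` is introduced here and is not notation from the text), the following sketch evaluates it for the four examples and confirms the defining relation $m r^\alpha = 1$.

```python
import math

def similarity_dimension(m, r):
    """The exponent alpha = log m / log(1/r) of Theorem 2.12, for m maps of common ratio r."""
    return math.log(m) / math.log(1 / r)

examples = {
    "Cantor set": (2, 1 / 3),
    "Sierpinski triangle": (3, 1 / 2),
    "von Koch curve": (4, 1 / 3),
    "Cantor dust, mu = 1/4": (4, 1 / 4),
}
for name, (m, r) in examples.items():
    alpha = similarity_dimension(m, r)
    print(name, alpha, m * r**alpha)      # the last column is 1 up to rounding
```

The formula gives the Hausdorff dimension only under the separation hypothesis of the theorem; without it, the same number is merely an upper bound.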

Figure 7. Open set in the separation of the von Koch similarities

We now turn to the proof of Theorem 2.12, which will follow the same
approach used in the case of the Sierpinski triangle. If $\alpha = \log m/\log(1/r)$,
we claim that $m_\alpha(F) < \infty$, hence $\dim F \le \alpha$. Moreover, this inequality
holds even without the separation assumption. Indeed, recall that

$$F_k = \tilde{S}^k(B),$$

and $\tilde{S}^k(B)$ is the union of $m^k$ sets of diameter less than $cr^k$ (with $c =
\operatorname{diam} B$), each of the form

$$S_{n_1} \circ S_{n_2} \circ \cdots \circ S_{n_k}(B), \quad \text{where } 1 \le n_i \le m \text{ and } 1 \le i \le k.$$

Consequently, if $cr^k \le \delta$, then

$$\mathcal{H}_\alpha^\delta(F) \le \sum_{n_1, \ldots, n_k} \bigl(\operatorname{diam} S_{n_1} \circ \cdots \circ S_{n_k}(B)\bigr)^\alpha
\le c' m^k r^{\alpha k} \le c',$$

since $mr^\alpha = 1$, because $\alpha = \log m/\log(1/r)$. Since $c'$ is independent of
$\delta$, we get $m_\alpha(F) \le c'$.

To prove $m_\alpha(F) > 0$, we now use the separation condition. We argue
in parallel with the earlier calculation of the Hausdorff dimension of the
Sierpinski triangle.

Fix a point $x$ in $F$. We define the “vertices” of the $k$th generation as
the $m^k$ points that lie in $F$ and are given by

$$S_{n_1} \circ \cdots \circ S_{n_k}(x), \quad \text{where } 1 \le n_1 \le m, \ldots, 1 \le n_k \le m.$$

Each vertex is labeled by $(n_1, \ldots, n_k)$. Vertices need not be distinct, so
they are counted with their multiplicities.

Similarly, we define the “open sets” of the $k$th generation to be the $m^k$
sets given by

$$S_{n_1} \circ \cdots \circ S_{n_k}(\mathcal{O}), \quad \text{where } 1 \le n_1 \le m, \ldots, 1 \le n_k \le m,$$

and where $\mathcal{O}$ is fixed and chosen to satisfy the separation condition.
Such open sets are again labeled by multi-indices $(n_1, n_2, \ldots, n_k)$ with
$1 \le n_j \le m$, $1 \le j \le k$.

Then the open sets of the $k$th generation are disjoint, since those of
the first generation are disjoint. Moreover if $k \ge \ell$, each open set of the
$\ell$th generation contains $m^{k-\ell}$ open sets of the $k$th generation.

Suppose $v$ is a vertex of the $k$th generation, and let $\mathcal{O}(v)$ denote the
open set in the $k$th generation which is associated to $v$, that is, $v$ and
$\mathcal{O}(v)$ carry the same label $(n_1, n_2, \ldots, n_k)$. Since $x$ is at a fixed distance
from the original open set $\mathcal{O}$, and $\mathcal{O}$ has a finite diameter, we find that

(a) $d(v, \mathcal{O}(v)) \le cr^k$.

(b) $c'r^k \le \operatorname{diam} \mathcal{O}(v) \le cr^k$.

As in the case of the Sierpinski triangle, it suffices to prove that if
$\mathcal{B} = \{B_j\}_{j=1}^{N}$ is a finite collection of balls whose diameters are less than
$\delta$ and whose union covers $F$, then

$$\sum_{j=1}^{N} (\operatorname{diam} B_j)^\alpha \ge c > 0.$$

Suppose we have such a covering by balls, and choose $k$ so that

$$r^k \le \min_{1 \le j \le N} \operatorname{diam} B_j < r^{k-1}.$$

Lemma 2.13 Suppose $B$ is a ball in the covering $\mathcal{B}$ that satisfies

$$r^\ell \le \operatorname{diam} B < r^{\ell-1} \quad \text{for some } \ell \le k.$$

Then $B$ contains at most $cm^{k-\ell}$ vertices of the $k$th generation.

Proof. If $v$ is a vertex of the $k$th generation with $v \in B$, and $\mathcal{O}(v)$
denotes the corresponding open set of the $k$th generation, then, for some
fixed dilate $B^*$ of $B$, properties (a) and (b) above guarantee that $\mathcal{O}(v) \subset
B^*$, and $B^*$ also contains the open set of generation $\ell$ that contains $\mathcal{O}(v)$.

Since $B^*$ has volume $cr^{d\ell}$, and each open set in the $\ell$th generation has
volume $\approx r^{d\ell}$ (by property (b) above), $B^*$ can contain at most $c$ open
sets of generation $\ell$. Hence $B^*$ contains at most $cm^{k-\ell}$ open sets of the
$k$th generation. Consequently, $B$ can contain at most $cm^{k-\ell}$ vertices of
the $k$th generation, and the lemma is proved.

For the final argument, let $N_\ell$ denote the number of balls in $\mathcal{B}$ so that

$$r^\ell \le \operatorname{diam} B_j < r^{\ell-1}.$$

By the lemma, we see that the total number of vertices of the $k$th
generation that can be covered by the collection $\mathcal{B}$ can be no more than
$c \sum_\ell N_\ell m^{k-\ell}$. Since all $m^k$ vertices of the $k$th generation belong to $F$,
we must have $c \sum_\ell N_\ell m^{k-\ell} \ge m^k$, and hence

$$\sum_{\ell} N_\ell\, m^{-\ell} \ge c.$$

The definition of $\alpha$ gives $r^{\ell\alpha} = m^{-\ell}$, and therefore

$$\sum_{j=1}^{N} (\operatorname{diam} B_j)^\alpha \ge \sum_{\ell} N_\ell\, r^{\ell\alpha} \ge c,$$

and the proof of Theorem 2.12 is complete.

3 Space-filling curves 

The year 1890 heralded an important discovery: Peano constructed a 
continuous curve that filled an entire square in the plane. Since then, 
many variants of his construction have been given. We shall describe here 
a construction that has the feature of elucidating an additional significant
fact. It is that from the point of view of measure theory, speaking broadly,
the unit interval and unit square are “isomorphic.”

Theorem 3.1 There exists a curve $t \mapsto \mathcal{P}(t)$ from the unit interval to
the unit square with the following properties:

(i) $\mathcal{P}$ maps $[0,1]$ to $[0,1] \times [0,1]$ continuously and surjectively.

(ii) $\mathcal{P}$ satisfies a Lipschitz condition of exponent 1/2, that is,

$$|\mathcal{P}(t) - \mathcal{P}(s)| \le M|t - s|^{1/2}.$$

(iii) The image under $\mathcal{P}$ of any sub-interval $[a, b]$ is a compact subset of
the square of (two-dimensional) Lebesgue measure exactly $b - a$.

The third conclusion can be elaborated further. 

Corollary 3.2 There are subsets $Z_1 \subset [0,1]$ and $Z_2 \subset [0,1] \times [0,1]$, each
of measure zero, such that $\mathcal{P}$ is bijective from

$$[0,1] - Z_1 \quad \text{to} \quad [0,1] \times [0,1] - Z_2$$

and measure preserving. In other words, $E$ is measurable if and only if
$\mathcal{P}(E)$ is measurable, and

$$m_1(E) = m_2(\mathcal{P}(E)).$$

Here $m_1$ and $m_2$ denote the Lebesgue measures in $\mathbb{R}^1$ and $\mathbb{R}^2$,
respectively.

We shall call the function $t \mapsto \mathcal{P}(t)$ the Peano mapping. Its image
is called the Peano curve.

Several observations help clarify the nature of the conclusions of the 
theorem. Suppose that F : [0,1] —>• [0,1] x [0,1] is continuous and sur¬ 
jective. Then: 

(a) $F$ cannot be Lipschitz of exponent $\gamma > 1/2$. This follows at once
from Lemma 2.2, which states that

$$\dim F([0,1]) \le \frac{1}{\gamma} \dim[0,1],$$

so that $2 \le 1/\gamma$ as desired.

(b) $F$ cannot be injective. Indeed, if this were the case, then the
inverse $G$ of $F$ would exist and would be continuous. Given any two
points $a \ne b$ in $[0,1]$, we would get a contradiction by looking at
two distinct curves in the square that join $F(a)$ and $F(b)$, since the
image of these two curves under $G$ would have to intersect at points
between $a$ and $b$. In fact, given any open disc $D$ in the square, there
always exists $x \in D$ so that $F(t) = F(s) = x$ yet $t \ne s$.

The proof of Theorem 3.1 will follow from a careful study of a natural
class of mappings that associate sub-squares in $[0,1] \times [0,1]$ to
sub-intervals in $[0,1]$. This implements the approach implicit in Hilbert’s
iterative procedure, which he set forth in the first three stages in
Figure 8.

Figure 8. Construction of the Peano curve

We turn now to the study of the general class of mappings.

3.1 Quartic intervals and dyadic squares

The quartic intervals arise when $[0,1]$ is successively sub-divided by
powers of 4. For instance, the first generation quartic intervals are the
closed intervals

$$I_1 = [0,1/4], \quad I_2 = [1/4,1/2], \quad I_3 = [1/2,3/4], \quad I_4 = [3/4,1].$$

The second generation quartic intervals are obtained by sub-dividing each
interval of the first generation by 4. Hence there are $16 = 4^2$ quartic
intervals of the second generation. In general, there are $4^k$ quartic intervals
of the $k$th generation, each of the form $\left[\frac{\ell}{4^k}, \frac{\ell+1}{4^k}\right]$, where $\ell$ is integral with
$0 \le \ell < 4^k$.

A chain of quartic intervals is a decreasing sequence of intervals

$$I^1 \supset I^2 \supset \cdots \supset I^k \supset \cdots,$$

where $I^k$ is a quartic interval of the $k$th generation (hence $|I^k| = 4^{-k}$).

Proposition 3.3 Chains of quartic intervals satisfy the following
properties:

(i) If $\{I^k\}$ is a chain of quartic intervals, then there exists a unique
$t \in [0,1]$ such that $t \in \bigcap_k I^k$.

(ii) Conversely, given $t \in [0,1]$, there is a chain $\{I^k\}$ of quartic
intervals such that $t \in \bigcap_k I^k$.

(iii) The set of $t$ for which the chain in part (ii) is not unique is a set
of measure zero (in fact, this set is countable).

Proof. Part (i) follows from the fact that $\{I^k\}$ is a decreasing sequence
of compact sets whose diameters go to 0.

For part (ii), we fix $t$ and note that for each $k$ there exists at least one
quartic interval $I^k$ with $t \in I^k$. If $t$ is of the form $\ell/4^k$, where $0 < \ell < 4^k$,
then there are exactly two quartic intervals of the $k$th generation that
contain $t$. Hence, the set of points for which the chain is not unique is
precisely the set of dyadic rationals

$$\frac{\ell}{4^k}, \quad \text{where } 1 \le k, \text{ and } 0 < \ell < 4^k.$$

Note that of course, these fractions are the same as those of the form
$\ell'/2^{k'}$ with $0 < \ell' < 2^{k'}$. This set is countable, hence has measure 0.

It is clear that each chain $\{I^k\}$ of quartic intervals can be represented
naturally by a string $.a_1a_2 \cdots a_k \cdots$, where each $a_k$ is either 0, 1, 2, or 3.
Then the point $t$ corresponding to this chain is given by

$$t = \sum_{k=1}^{\infty} \frac{a_k}{4^k}.$$

The points where ambiguity occurs are precisely those where $a_k = 3$ for
all sufficiently large $k$, or equivalently where $a_k = 0$ for all sufficiently
large $k$.
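
The digit representation and its ambiguity can be illustrated with exact rational arithmetic. The helper names `quartic_digits` and `from_digits` below are introduced here for illustration; the first always produces the expansion ending in 0s, which is one of the two choices at an ambiguous point.

```python
from fractions import Fraction

def quartic_digits(t, n):
    """First n base-4 digits a_1, ..., a_n of t in [0, 1), using the expansion that ends in 0s."""
    digits = []
    for _ in range(n):
        t *= 4
        a = int(t)            # integer part, one of 0, 1, 2, 3
        digits.append(a)
        t -= a
    return digits

def from_digits(digits):
    """The partial sum of a_k / 4^k over the given digits."""
    return sum(Fraction(a, 4**k) for k, a in enumerate(digits, start=1))

t = Fraction(1, 4)
print(quartic_digits(t, 6))                 # the string .100000
print(t - from_digits([0] + [3] * 9))       # the string .0333... approaches t from below
```

Both strings $.1000\cdots$ and $.0333\cdots$ therefore describe chains of quartic intervals that shrink to the same point $t = 1/4$.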

Part of our description of the Peano mapping will follow from
associating to each quartic interval a dyadic square. These dyadic squares
are obtained by sub-dividing the unit square $[0,1] \times [0,1]$ in the plane by
successively bisecting the sides.

For instance, dyadic squares of the first generation arise from bisecting
the sides of the unit square. This yields four closed squares $S_1$, $S_2$, $S_3$
and $S_4$, each of side length 1/2 and area $|S_i| = 1/4$, for $i = 1, \ldots, 4$.

The dyadic squares of the second generation are obtained by bisecting
each dyadic square of the first generation, and so on. In general, there
are $4^k$ squares of the $k$th generation, each of side length $1/2^k$ and area
$1/4^k$.

A chain of dyadic squares is a decreasing sequence of squares

$$S^1 \supset S^2 \supset \cdots \supset S^k \supset \cdots,$$

where $S^k$ is a dyadic square of the $k$th generation.

Proposition 3.4 Chains of dyadic squares have the following
properties:

(i) If $\{S^k\}$ is a chain of dyadic squares, then there exists a unique
$x \in [0,1] \times [0,1]$ such that $x \in \bigcap_k S^k$.

(ii) Conversely, given $x \in [0,1] \times [0,1]$, there is a chain $\{S^k\}$ of dyadic
squares such that $x \in \bigcap_k S^k$.

(iii) The set of $x$ for which the chain in part (ii) is not unique is a set
of measure zero.

In this case, the set of ambiguities consists of all points $(x_1, x_2)$ where
one of the coordinates is a dyadic rational. Geometrically, this set is
the (countable) union of vertical and horizontal segments in $[0,1] \times [0,1]$
determined by the grid of dyadic rationals. This set has measure zero.

Moreover, each chain of dyadic squares can be represented by a string
$.b_1b_2 \cdots$, where each $b_k$ is either 0, 1, 2 or 3. Then

$$x = \sum_{k=1}^{\infty} \frac{\bar{b}_k}{2^k}, \qquad (1)$$

where

$$\bar{b}_k = \begin{cases}
(0,0) & \text{if } b_k = 0, \\
(0,1) & \text{if } b_k = 1, \\
(1,0) & \text{if } b_k = 2, \\
(1,1) & \text{if } b_k = 3.
\end{cases}$$

3.2 Dyadic correspondence 

A dyadic correspondence is a mapping $\Phi$ from quartic intervals to
dyadic squares that satisfies:

(1) $\Phi$ is bijective.

(2) $\Phi$ respects generations.

(3) $\Phi$ respects inclusion.

By (2), we mean that if $I$ is a quartic interval of the $k$th generation, then
$\Phi(I)$ is a dyadic square of the $k$th generation. By (3), we mean that if
$I \subset J$, then $\Phi(I) \subset \Phi(J)$.

For example, the trivial, or standard correspondence assigns to the
string $.a_1a_2 \cdots$ the string $.b_1b_2 \cdots$ with $b_k = a_k$.

Given a dyadic correspondence $\Phi$, the induced mapping $\Phi^*$ maps
$[0,1]$ to $[0,1] \times [0,1]$ and is given as follows. If $\{t\} = \bigcap_k I^k$, where $\{I^k\}$
is a chain of quartic intervals, then, since $\{\Phi(I^k)\}$ is a chain of dyadic
squares, we may let

$$\Phi^*(t) = x = \bigcap_k \Phi(I^k).$$

We note that $\Phi^*$ is well-defined except on a (countable) set of measure
zero (those points $t$ that are represented by more than one quartic chain).

A moment’s reflection will show that if $I'$ is a quartic interval of the
$k$th generation, then the images $\Phi^*(I') = \{\Phi^*(t),\ t \in I'\}$ comprise the
dyadic square of the $k$th generation $\Phi(I')$. Thus $\Phi^*(I') = \Phi(I')$, and
hence $m_1(I') = m_2(\Phi^*(I'))$.
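
For the standard correspondence the induced mapping can be evaluated on points with finite base-4 expansions. The sketch below is an illustration under stated assumptions: it takes the digit-to-corner rule to be $0 \mapsto (0,0)$, $1 \mapsto (0,1)$, $2 \mapsto (1,0)$, $3 \mapsto (1,1)$, and the names `CORNER`, `square_point` and `induced_standard` are hypothetical. At ambiguous points it uses the expansion ending in 0s.

```python
from fractions import Fraction

# Assumed rule sending a digit b_k to a corner of the unit square.
CORNER = {0: (0, 0), 1: (0, 1), 2: (1, 0), 3: (1, 1)}

def square_point(digits):
    """The point obtained by summing corner(b_k) / 2^k over a finite digit string b_1 ... b_n."""
    x = y = Fraction(0)
    for k, b in enumerate(digits, start=1):
        u, v = CORNER[b]
        x += Fraction(u, 2**k)
        y += Fraction(v, 2**k)
    return x, y

def induced_standard(t, n):
    """Image of t in [0, 1) under the standard correspondence, using n base-4 digits of t."""
    digits = []
    for _ in range(n):
        t *= 4
        a = int(t)
        digits.append(a)
        t -= a
    return square_point(digits)           # standard correspondence: b_k = a_k

print(induced_standard(Fraction(3, 4), 5))
print(induced_standard(Fraction(6, 16), 5))
```

This mapping respects generations and inclusion, but nothing here shows it to be continuous; the standard correspondence serves only as the simplest example of the definition.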

Theorem 3.5 Given a dyadic correspondence $\Phi$, there exist sets $Z_1 \subset
[0,1]$ and $Z_2 \subset [0,1] \times [0,1]$, each of measure zero, so that:

(i) $\Phi^*$ is a bijection on $[0,1] - Z_1$ to $[0,1] \times [0,1] - Z_2$.

(ii) $E$ is measurable if and only if $\Phi^*(E)$ is measurable.

(iii) $m_1(E) = m_2(\Phi^*(E))$.

Proof. First, let $\mathcal{N}_1$ denote the collection of chains of those quartic
intervals arising in (iii) of Proposition 3.3, those for which the points in
$I = [0,1]$ are not uniquely representable. Similarly, let $\mathcal{N}_2$ denote the
collection of chains of those dyadic squares for which the corresponding
points in the square $I \times I$ are not uniquely representable.

Since $\Phi$ is a bijection from chains of quartic intervals to chains of dyadic
squares, it is also a bijection from $\mathcal{N}_1 \cup \Phi^{-1}(\mathcal{N}_2)$ to $\Phi(\mathcal{N}_1) \cup \mathcal{N}_2$, and
hence also of their complements. Let $Z_1$ be the subset of $I$ consisting of
all points in $I$ that can be represented (according to (i) of Proposition 3.3)
by the chains in $\mathcal{N}_1 \cup \Phi^{-1}(\mathcal{N}_2)$, and let $Z_2$ be the set of points in the
square that can be represented by dyadic squares in $\Phi(\mathcal{N}_1) \cup \mathcal{N}_2$. Then
$\Phi^*$, the induced mapping, is well-defined on $I - Z_1$, and gives a bijection
of $I - Z_1$ to $(I \times I) - Z_2$. To prove that both $Z_1$ and $Z_2$ have measure
zero, we invoke the following lemma. We suppose $\{f_k\}_{k=1}^{\infty}$ is a fixed given
sequence, with each $f_k$ either 0, 1, 2, or 3.

Lemma 3.6 Let

$$E_0 = \Big\{ x = \sum_{k=1}^{\infty} a_k/4^k, \ \text{where } a_k \ne f_k \text{ for all sufficiently large } k \Big\}.$$

Then $m(E_0) = 0$.

Indeed, if we fix $r$, then $m(\{x : a_r \ne f_r\}) = 3/4$, and

$$m(\{x : a_r \ne f_r \text{ and } a_{r+1} \ne f_{r+1}\}) = (3/4)^2, \ \text{etc.}$$

Thus $m(\{x : a_k \ne f_k, \text{ all } k \ge r\}) = 0$, and $E_0$ is a countable union of
such sets, from which the lemma follows.
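
The counting behind the factor $(3/4)^n$ can be checked by brute force on short digit strings. The sketch below is illustrative only; `avoiding_fraction` is a hypothetical helper, and it enumerates all base-4 strings of length $n$ rather than computing a measure directly (each string of length $n$ corresponds to a quartic interval of length $4^{-n}$).

```python
from fractions import Fraction
from itertools import product


def avoiding_fraction(f):
    """Fraction of base-4 strings a_1..a_n with a_k != f_k for every k."""
    n = len(f)
    good = sum(
        1
        for a in product(range(4), repeat=n)
        if all(ak != fk for ak, fk in zip(a, f))
    )
    return Fraction(good, 4 ** n)


for n in range(1, 6):
    f = [1] * n  # a constant forbidden sequence f_k = 1, chosen for illustration
    print(n, avoiding_fraction(f), Fraction(3, 4) ** n)
```

The two printed fractions agree for each $n$, and they tend to zero as $n$ grows, which is the content of the "etc." in the argument.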

There is a similar statement for points in the square S = I x I in terms 
of the representation ( 1 ). 

Note that as a result the set of points in $I$ corresponding to chains in
$\mathcal{N}_1$ form a set of measure zero. In fact, we may use the lemma for the
sequence for which $f_k = 1$, for all $k$, since the elements of $\mathcal{N}_1$ correspond
to sequences $\{a_k\}$ with $a_k = 0$ for all sufficiently large $k$, or $a_k = 3$ for
all sufficiently large $k$.

Similarly, the points in the square $S$ corresponding to $\mathcal{N}_2$ form a set of
measure zero. To see this, take for example $f_k = 1$ for $k$ odd, and $f_k = 2$
for $k$ even, and note that $\mathcal{N}_2$ corresponds to all sequences $\{a_k\}$ where
one of the following four exclusive alternatives holds for all sufficiently
large $k$: either $a_k$ is 0 or 1; or $a_k$ is 2 or 3; or $a_k$ is 0 or 2; or $a_k$ is 1
or 3. By similar reasoning the points $\Phi^{-1}(\mathcal{N}_2)$ and $\Phi(\mathcal{N}_1)$ form sets of
measure zero in $I$ and $I \times I$ respectively.

We now turn to the proof that $\Phi^*$ (which is a bijection from $I - Z_1$
to $(I \times I) - Z_2$) is measure preserving. For this it is useful to recall
Theorem 1.4 in Chapter 1, whereby any open set $\mathcal{O}$ in the unit interval
$I$ can be realized as a countable union $\bigcup_{j=1}^{\infty} I_j$, where each $I_j$ is a closed
interval and the $I_j$ have disjoint interiors. Moreover, an examination of
the proof shows that the intervals can be taken to be dyadic, that is, of the
form $[\ell/2^j, (\ell+1)/2^j]$, for appropriate integers $\ell$ and $j$. Further, such an
interval is itself a quartic interval if $j$ is even, $j = 2k$, or the union of two
quartic intervals $[(2\ell)/2^{2k}, (2\ell+1)/2^{2k}]$ and $[(2\ell+1)/2^{2k}, (2\ell+2)/2^{2k}]$,
if $j$ is odd, $j = 2k - 1$. Thus any open set in $I$ can be given as a union of
quartic intervals whose interiors are disjoint. Similarly, any open set in
the square $I \times I$ is a union of dyadic squares whose interiors are disjoint.

Now let $E$ be any set of measure zero in $I - Z_1$ and $\epsilon > 0$. Then we
can cover $E \subset \bigcup_j I_j$, where $I_j$ are quartic intervals and $\sum_j m_1(I_j) < \epsilon$.
Because $\Phi^*(E) \subset \bigcup_j \Phi^*(I_j)$, then

$$m_2(\Phi^*(E)) \le \sum_j m_2(\Phi^*(I_j)) = \sum_j m_1(I_j) < \epsilon.$$

Thus $\Phi^*(E)$ is measurable and $m_2(\Phi^*(E)) = 0$. Similarly, $(\Phi^*)^{-1}$ maps
sets of measure zero in $(I \times I) - Z_2$ to sets of measure zero in $I$.

Now the argument above also shows that if $\mathcal{O}$ is any open set in $I$,
then $\Phi^*(\mathcal{O} - Z_1)$ is measurable, and $m_2(\Phi^*(\mathcal{O} - Z_1)) = m_1(\mathcal{O})$. Thus
this identity goes over to $G_\delta$ sets in $I$. Since any measurable set differs
from a $G_\delta$ set by a set of measure zero, we see that we have established
that $m_2(\Phi^*(E)) = m_1(E)$ for any measurable subset $E$ of $I - Z_1$. The
same argument can be applied to $(\Phi^*)^{-1}$, and this completes the proof
of the theorem.

The Peano mapping will be obtained as $\Phi^*$ for a special correspondence
$\Phi$.

3.3 Construction of the Peano mapping 

The particular dyadic correspondence we now present provides us with
the steps to follow when tracing the approximations of the Peano curve.
The main idea behind its construction is that as we go from one quartic
interval in the $k$th generation to the next quartic interval in the same
generation, we move from a dyadic square of the $k$th generation to another
square of the $k$th generation that shares a common side.

More precisely, we say that two quartic intervals in the same generation
are adjacent if they share a point in common. Also, two squares in the
same generation are adjacent if they share a side in common.

Lemma 3.7 There is a unique dyadic correspondence $\Phi$ so that:

(i) If $I$ and $J$ are two adjacent intervals of the same generation, then
$\Phi(I)$ and $\Phi(J)$ are two adjacent squares (of the same generation).

(ii) In generation $k$, if $I_-$ is the left-most interval and $I_+$ the right-most
interval, then $\Phi(I_-)$ is the left-lower square and $\Phi(I_+)$ is the
right-lower square.

Part (ii) of the lemma is illustrated in Figure 9.



Given a square $S$ and its four immediate sub-squares, an acceptable
traverse is an ordering of the sub-squares $S_1$, $S_2$, $S_3$, and $S_4$, so that
$S_j$ and $S_{j+1}$ are adjacent for $j = 1, 2, 3$. With such an ordering, we note
that if we color $S_1$ white, and then alternate black and white, the square
$S_3$ is also white, while $S_2$ and $S_4$ are black. The important point to
remember is that if the first square in a traverse is white, then the last
square is black.

The key observation is the following. Suppose we are given a square
$S$, and a side $\sigma$ of $S$. If $S_1$ is any of the immediate four sub-squares in
$S$, then there exists a unique traverse $S_1$, $S_2$, $S_3$, and $S_4$ so that the last
square $S_4$ has a side in common with $\sigma$. With the initial square $S_1$ in
the lower-left corner of $S$, the four possibilities, which correspond to the
four choices of $\sigma$, are illustrated in Figure 10.

We may now begin the inductive description of the dyadic correspondence
satisfying the conditions in the lemma. On quartic intervals of the
first generation we assign the square $S_j = \Phi(I_j)$, as pictured in Figure 11.

Figure 11. Initial step of the correspondence


Now suppose $\Phi$ has been defined for all quartic intervals of generation
less than or equal to $k$. We now write the intervals in generation $k$ in
increasing order as $I_1, \ldots, I_{4^k}$, and let $S_j = \Phi(I_j)$. We then divide $I_1$
into four quartic intervals of generation $k + 1$ and denote them by $I_{1,1}$,
$I_{1,2}$, $I_{1,3}$, and $I_{1,4}$, where the intervals are chosen in increasing order.

Then, we assign to each interval $I_{1,j}$ a dyadic square $\Phi(I_{1,j}) = S_{1,j}$ of
generation $k + 1$ contained in $S_1$ so that:

(a) $S_{1,1}$ is the lower-left sub-square of $S_1$,

(b) $S_{1,4}$ touches the side that $S_1$ shares with $S_2$,

(c) $S_{1,1}$, $S_{1,2}$, $S_{1,3}$, and $S_{1,4}$ is a traverse.

This is possible, since the induction hypothesis guarantees that $S_2$ is
adjacent to $S_1$.

This settles the assignments for the sub-squares of $S_1$, so we now turn
our attention to $S_2$. Let $I_{2,1}$, $I_{2,2}$, $I_{2,3}$, and $I_{2,4}$ denote the quartic
intervals of generation $k + 1$ in $I_2$, written in increasing order. First, we
take $S_{2,1} = \Phi(I_{2,1})$ to be the sub-square of $S_2$ which is adjacent to $S_{1,4}$.
This can be done because $S_{1,4}$ touches $S_2$ by construction. Note that
we leave $S_1$ from a black square ($S_{1,4}$), and enter $S_2$ in a white square
($S_{2,1}$). Since $S_3$ is adjacent to $S_2$, we may now find a traverse $S_{2,1}$, $S_{2,2}$,
$S_{2,3}$ and $S_{2,4}$ so that $S_{2,4}$ touches $S_3$.

We may then repeat this process in each interval $I_j$ and square $S_j$,
$j = 3, \ldots, 4^k$. Note that at each stage the square $S_{j,1}$ (the "entering"
square) is white, while $S_{j,4}$ (the "exiting" square) is black.

In the final step, the induction hypothesis guarantees that $S_{4^k}$ is the
lower-right corner square. Moreover, since $S_{4^k - 1}$ must be adjacent to
$S_{4^k}$, it must be either above it, or to the left of it, so we enter a square of
the $(k + 1)$st generation along an upper or left side. The entering square
is a white square, and we traverse to the lower-right corner sub-square
of $S_{4^k}$, which is a black square.

This concludes the inductive step, hence the proof of Lemma 3.7. 
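
The inductive construction can be explored numerically. The sketch below does not transcribe the proof; it uses the standard Hilbert-curve index-to-cell routine, on the assumption that this ordering agrees with the correspondence of Lemma 3.7 (which the uniqueness statement suggests, since the checks confirm the defining properties for the generations tested). The names `peano_square` and `check_generation` are hypothetical.

```python
def peano_square(k, j):
    """Cell (column, row) of the j-th square, j = 0..4^k - 1, in generation k.

    (0, 0) is the lower-left cell. This is the usual Hilbert index-to-cell
    routine, assumed here to realize the ordering of Lemma 3.7.
    """
    x = y = 0
    t = j
    s = 1
    while s < 2 ** k:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Reflect inside the current s-by-s block.
                x, y = s - 1 - x, s - 1 - y
            # Swap the two coordinates.
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def check_generation(k):
    """Verify the properties of a dyadic correspondence in generation k."""
    cells = [peano_square(k, j) for j in range(4 ** k)]
    assert len(set(cells)) == 4 ** k              # bijective
    assert cells[0] == (0, 0)                     # left-lower square first
    assert cells[-1] == (2 ** k - 1, 0)           # right-lower square last
    for (x0, y0), (x1, y1) in zip(cells, cells[1:]):
        assert abs(x0 - x1) + abs(y0 - y1) == 1   # consecutive squares share a side
    if k > 1:
        for j, (x, y) in enumerate(cells):
            # Inclusion: the parent interval j // 4 maps to the parent square.
            assert (x // 2, y // 2) == peano_square(k - 1, j // 4)
    return cells


print(check_generation(1))  # [(0, 0), (0, 1), (1, 1), (1, 0)]
for k in range(2, 5):
    check_generation(k)
```

Checking a few generations is of course no substitute for the induction above; it only confirms that the ordering behaves as the lemma requires in those cases.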

We may now begin the actual description of the Peano curve. For each 
generation k we construct a polygonal line which consists of vertical and 
horizontal line segments connecting the centers of consecutive squares. 
More precisely, let $\Phi$ denote the dyadic correspondence in Lemma 3.7,
and let $S_1, \ldots, S_{4^k}$ be the squares of the $k$th generation ordered according
to $\Phi$, that is, $\Phi(I_j) = S_j$. Let $t_j$ denote the middle point of $I_j$,

$$t_j = \frac{j - \frac{1}{2}}{4^k} \quad \text{for } j = 1, \ldots, 4^k.$$

Let $x_j$ be the center of the square $S_j$, and define

$$\mathcal{P}_k(t_j) = x_j.$$

Also set

$$\mathcal{P}_k(0) = (0, 1/2^{k+1}) = x_0 \quad \text{where } t_0 = 0,$$

and

$$\mathcal{P}_k(1) = (1, 1/2^{k+1}) = x_{4^k+1} \quad \text{where } t_{4^k+1} = 1.$$

Then, we extend $\mathcal{P}_k(t)$ to the unit interval $0 \le t \le 1$ by linearity along
the sub-intervals determined by the division points $t_0, \ldots, t_{4^k+1}$.

Note that the distance $|x_j - x_{j+1}| = 1/2^k$, while $|t_j - t_{j+1}| = 1/4^k$ for
$0 < j < 4^k$. Also

$$|x_1 - x_0| = |x_{4^k} - x_{4^k+1}| = \frac{1}{2 \cdot 2^k},$$

while

$$|t_1 - t_0| = |t_{4^k} - t_{4^k+1}| = \frac{1}{2 \cdot 4^k}.$$

Therefore $|\mathcal{P}_k'(t)| = 4^k 2^{-k} = 2^k$ except when $t = t_j$.

As a result,

$$|\mathcal{P}_k(t) - \mathcal{P}_k(s)| \le 2^k |t - s|.$$

However,

$$|\mathcal{P}_{k+1}(t) - \mathcal{P}_k(t)| \le \sqrt{2}\, 2^{-k},$$

because when $\ell/4^k \le t \le (\ell + 1)/4^k$, then $\mathcal{P}_{k+1}(t)$ and $\mathcal{P}_k(t)$ belong to
the same dyadic square of generation $k$.
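
The two estimates above can be tested on sample points. The following sketch builds the polygonal approximations from the division points and square centers described in the text; the ordering of the squares is supplied by the standard Hilbert index-to-cell routine, which is an assumption standing in for the correspondence of Lemma 3.7, and `cell`, `make_P` and `max_gap` are hypothetical names.

```python
import bisect
import math


def cell(k, j):
    """Assumed ordering of generation-k squares (standard Hilbert routine)."""
    x = y = 0
    t, s = j, 1
    while s < 2 ** k:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def make_P(k):
    """Return the piecewise-linear map P_k on [0, 1] as a Python function."""
    n, side = 4 ** k, 2 ** k
    # Division points t_0 = 0, t_j = (j - 1/2) / 4^k, t_{4^k + 1} = 1.
    ts = [0.0] + [(j - 0.5) / n for j in range(1, n + 1)] + [1.0]
    # x_0, the centers x_1..x_{4^k}, and x_{4^k + 1}.
    xs = [(0.0, 0.5 / side)]
    xs += [((cx + 0.5) / side, (cy + 0.5) / side)
           for cx, cy in (cell(k, j) for j in range(n))]
    xs.append((1.0, 0.5 / side))

    def P(t):
        i = min(bisect.bisect_right(ts, t) - 1, len(ts) - 2)
        w = (t - ts[i]) / (ts[i + 1] - ts[i])
        (x0, y0), (x1, y1) = xs[i], xs[i + 1]
        return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))

    return P


def max_gap(k, samples=1000):
    """Largest sampled value of |P_{k+1}(t) - P_k(t)|."""
    Pk, Pk1 = make_P(k), make_P(k + 1)
    gaps = []
    for i in range(samples + 1):
        (a, b), (c, d) = Pk(i / samples), Pk1(i / samples)
        gaps.append(math.hypot(a - c, b - d))
    return max(gaps)


for k in (1, 2, 3):
    print(k, max_gap(k), math.sqrt(2) * 2 ** -k)
```

In each printed row the sampled gap should not exceed the bound $\sqrt{2}\,2^{-k}$; sampling only probes finitely many $t$, so this is a sanity check rather than a proof.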

Therefore the limit

$$\mathcal{P}(t) = \lim_{k \to \infty} \mathcal{P}_k(t) = \mathcal{P}_1(t) + \sum_{j=1}^{\infty} \big( \mathcal{P}_{j+1}(t) - \mathcal{P}_j(t) \big)$$

exists, and defines a continuous function in view of the uniform convergence.
By Lemma 2.8 we conclude that

$$|\mathcal{P}(t) - \mathcal{P}(s)| \le M |t - s|^{1/2},$$

and $\mathcal{P}$ satisfies a Lipschitz condition of exponent 1/2.

Moreover, each $\mathcal{P}_k(t)$ visits each dyadic square of generation $k$ as $t$
ranges in $[0,1]$. Hence the image of $\mathcal{P}$ is dense in the unit square, and by
continuity we find that $t \mapsto \mathcal{P}(t)$ is a surjection.

Finally, to prove the measure preserving property of $\mathcal{P}$ it suffices to
establish $\mathcal{P} = \Phi^*$.

Lemma 3.8 If $\Phi$ is the dyadic correspondence in Lemma 3.7, then $\Phi^*(t) =
\mathcal{P}(t)$ for every $0 \le t \le 1$.


Proof. First, we observe that $\Phi^*(t)$ is unambiguously defined for
every $t$. Indeed, suppose $t \in \bigcap_k I^k$ and $t \in \bigcap_k J^k$ are two chains of
quartic intervals; then $I^k$ and $J^k$ must be adjacent for sufficiently large






$k$. Thus $\Phi(I^k)$ and $\Phi(J^k)$ must be adjacent squares for all sufficiently
large $k$. Hence

$$\bigcap_k \Phi(I^k) = \bigcap_k \Phi(J^k).$$

Next, directly from our construction we have

$$\bigcap_k \Phi(I^k) = \lim_{k \to \infty} \mathcal{P}_k(t) = \mathcal{P}(t).$$

This gives the desired conclusion.

The argument also shows that $\mathcal{P}(I) = \Phi(I)$ for any quartic interval $I$.
Now recall that any interval $(a, b)$ can be written as $\bigcup_j I_j$, where the $I_j$
are quartic intervals with disjoint interiors. Because $\mathcal{P}(I_j) = \Phi(I_j)$, these
are then dyadic squares with disjoint interiors. Since $\mathcal{P}(a, b) = \bigcup_j \mathcal{P}(I_j)$,
we have

$$m_2(\mathcal{P}(a, b)) = \sum_{j=1}^{\infty} m_2(\mathcal{P}(I_j)) = \sum_{j=1}^{\infty} m_2(\Phi(I_j)) = \sum_{j=1}^{\infty} m_1(I_j) = m_1(a, b).$$

This proves conclusion (iii) of Theorem 3.1. The other conclusions having
already been established, we need only note that the corollary is
contained in Theorem 3.5.

As a result, we conclude that $t \mapsto \mathcal{P}(t)$ also induces a measure preserving
mapping from $[0,1]$ to $[0,1] \times [0,1]$. This concludes the proof of
Theorem 3.1.


4* Besicovitch sets and regularity 

We begin by presenting a surprising regularity property enjoyed by all
measurable subsets (of finite measure) of $\mathbb{R}^d$ when $d \ge 3$. As we shall
see, the fact that the corresponding phenomenon does not hold for $d =
2$ is due to the existence of a remarkable set that was discovered by
Besicovitch. A construction of a set of this kind will be detailed in
Section 4.4.

We first fix some notation. For each unit vector $\gamma$ on the sphere,
$\gamma \in S^{d-1}$, and each $t \in \mathbb{R}$ we consider the plane $\mathcal{P}_{t,\gamma}$, which is defined
as the $(d - 1)$-dimensional affine hyperplane perpendicular to $\gamma$ and of
"signed distance" $t$ from the origin.¹ The plane $\mathcal{P}_{t,\gamma}$ is given by

$$\mathcal{P}_{t,\gamma} = \{x \in \mathbb{R}^d : x \cdot \gamma = t\}.$$

¹ Note that there are two planes perpendicular to $\gamma$ and of distance $|t|$ from the origin;
this accounts for the fact that $t$ may be either positive or negative.

We observe that each $\mathcal{P}_{t,\gamma}$ carries a natural $(d - 1)$-dimensional Lebesgue measure,
denoted by $m_{d-1}$. In fact, if we complete $\gamma$ to an orthonormal basis
$e_1, e_2, \ldots, e_{d-1}, \gamma$ of $\mathbb{R}^d$, then we can write any $x \in \mathbb{R}^d$ in terms of the
corresponding coordinates as $x = x_1 e_1 + x_2 e_2 + \cdots + x_d \gamma$. When we set
$x \in \mathbb{R}^d = \mathbb{R}^{d-1} \times \mathbb{R}$ with $(x_1, \ldots, x_{d-1}) \in \mathbb{R}^{d-1}$, $x_d \in \mathbb{R}$, then the measure
$m_{d-1}$ on $\mathcal{P}_{t,\gamma}$ is the Lebesgue measure on $\mathbb{R}^{d-1}$. This definition of
$m_{d-1}$ is independent of the choice of orthonormal vectors $e_1, e_2, \ldots, e_{d-1}$,
since Lebesgue measure is invariant under rotations. (See Problem 4,
Chapter 2, or Exercise 26, Chapter 3.)

With these preliminaries out of the way, we define for each subset
$E \subset \mathbb{R}^d$ the slice of $E$ cut out by the plane $\mathcal{P}_{t,\gamma}$ as

$$E_{t,\gamma} = E \cap \mathcal{P}_{t,\gamma}.$$

We now consider the slices $E_{t,\gamma}$ as $t$ varies, where $E$ is measurable and
$\gamma$ is fixed. (See Figure 12.)

We observe that for almost every $t$ the set $E_{t,\gamma}$ is $m_{d-1}$ measurable
and, moreover, $m_{d-1}(E_{t,\gamma})$ is a measurable function of $t$. This is a
direct consequence of Fubini's theorem and the above decomposition,
$\mathbb{R}^d = \mathbb{R}^{d-1} \times \mathbb{R}$. In fact, so long as the direction $\gamma$ is pre-assigned, not
much more can be said in general about the function $t \mapsto m_{d-1}(E_{t,\gamma})$.
However, when $d \ge 3$ the nature of the function is dramatically different
for "most" $\gamma$. This is contained in the following theorem.

Theorem 4.1 Suppose $E$ is of finite measure in $\mathbb{R}^d$, with $d \ge 3$. Then
for almost every $\gamma \in S^{d-1}$:

(i) $E_{t,\gamma}$ is measurable for all $t \in \mathbb{R}$.

(ii) $m_{d-1}(E_{t,\gamma})$ is continuous in $t \in \mathbb{R}$.

Moreover, the function of $t$ defined by $\mu(t, \gamma) = m_{d-1}(E_{t,\gamma})$ satisfies a
Lipschitz condition with exponent $\alpha$ for any $\alpha$ with $0 < \alpha < 1/2$.

The almost everywhere assertion is with respect to the natural measure
$d\sigma$ on $S^{d-1}$ that arises in the polar coordinate formula in Section 3.2 of
the previous chapter.

We recall that a function $f$ is Lipschitz with exponent $\alpha$ if
$|f(t_1) - f(t_2)| \le A |t_1 - t_2|^{\alpha}$ for some $A$.

A significant part of (i) is that for a.e. $\gamma$, the slice $E_{t,\gamma}$ is measurable
for all values of the parameter $t$. In particular, one has the following.

Corollary 4.2 Suppose $E$ is a set of measure zero in $\mathbb{R}^d$ with $d \ge 3$.
Then, for almost every $\gamma \in S^{d-1}$, the slice $E_{t,\gamma}$ has zero measure for all
$t \in \mathbb{R}$.

The fact that there is no analogue of this when $d = 2$ is a consequence of
the existence of a Besicovitch set (also called a "Kakeya set"), which is
defined as a set that satisfies the three conditions in the theorem below.

Theorem 4.3 There exists a set $B$ in $\mathbb{R}^2$ that:

(i) is compact,

(ii) has Lebesgue measure zero,

(iii) contains a translate of every unit line segment.

Note that with $F = B$ and $\gamma \in S^1$ one has $m_1(F \cap \mathcal{P}_{t_0,\gamma}) \ge 1$ for some $t_0$.
If $m_1(F \cap \mathcal{P}_{t,\gamma})$ were continuous in $t$, then this measure would be strictly
positive for an interval in $t$ containing $t_0$, and thus we would have
$m_2(F) > 0$, by Fubini's theorem. This contradiction shows that the analogue
of Theorem 4.1 cannot hold for $d = 2$.

While the set $B$ has zero two-dimensional measure, this assertion cannot
be improved by replacing this measure by $\alpha$-dimensional Hausdorff
measure, with $\alpha < 2$.
Theorem 4.4 Suppose $F$ is any set that satisfies the conclusions (i)
and (iii) of Theorem 4.3. Then $F$ has Hausdorff dimension 2.

4.1 The Radon transform

Theorems 4.1 and 4.4 will be derived by an analysis of the regularity
properties of the Radon transform $\mathcal{R}$. The operator $\mathcal{R}$ arises in a number
of problems in analysis, and was already considered in Chapter 6 of
Book I.

For an appropriate function $f$ on $\mathbb{R}^d$, the Radon transform of $f$ is
defined by

$$\mathcal{R}(f)(t, \gamma) = \int_{\mathcal{P}_{t,\gamma}} f.$$

The integration is performed over the plane $\mathcal{P}_{t,\gamma}$ with respect to the
measure $m_{d-1}$ discussed above. We first make the following simple
observations:

1. If $f$ is continuous and has compact support, then $f$ is of course
integrable on every plane $\mathcal{P}_{t,\gamma}$, and so $\mathcal{R}(f)(t, \gamma)$ is defined for all
$(t, \gamma) \in \mathbb{R} \times S^{d-1}$. Moreover it is a continuous function of the pair
$(t, \gamma)$ and has compact support in the $t$-variable.

2. If $f$ is merely Lebesgue integrable, then $f$ may fail to be measurable
or integrable on $\mathcal{P}_{t,\gamma}$ for some $(t, \gamma)$, and thus $\mathcal{R}(f)(t, \gamma)$ is not
defined for those $(t, \gamma)$.

3. Suppose $f$ is the characteristic function of the set $E$, that is, $f =
\chi_E$. Then $\mathcal{R}(f)(t, \gamma) = m_{d-1}(E_{t,\gamma})$ if $E_{t,\gamma}$ is measurable.

It is this last property that links the Radon transform to our problem.
Key estimates in this connection involve a maximal "Radon transform"
defined by

$$\mathcal{R}^*(f)(\gamma) = \sup_{t \in \mathbb{R}} |\mathcal{R}(f)(t, \gamma)|,$$

as well as corresponding expressions controlling the Lipschitz character
of $\mathcal{R}(f)(t, \gamma)$ as a function of $t$. A basic fact inherent in our analysis
is that the regularity of the Radon transform actually improves as the
dimension of the underlying space increases.

Theorem 4.5 Suppose $f$ is continuous and has compact support in $\mathbb{R}^d$
with $d \ge 3$. Then

$$\int_{S^{d-1}} \mathcal{R}^*(f)(\gamma)\, d\sigma(\gamma) \le c \left[ \|f\|_{L^1(\mathbb{R}^d)} + \|f\|_{L^2(\mathbb{R}^d)} \right] \tag{2}$$

for some constant $c > 0$ that does not depend on $f$.

An inequality of this type is a typical "a priori" estimate. It is obtained
first under some regularity assumption on the function $f$, and then a
limiting argument allows one to pass to the more general case when $f$
belongs to $L^1 \cap L^2$.

We make some comments about the appearance of both the $L^1$-norm
and $L^2$-norm in (2). The $L^2$-norm imposes a crucial local control of
the kind that is necessary for the desired regularity. (See Exercise 27.)
However, without some restriction on $f$ of a global nature, the function
$f$ might fail to be integrable on any plane $\mathcal{P}_{t,\gamma}$, as the example $f(x) =
1/(1 + |x|^{d-1})$ shows. Note that this function belongs to $L^2(\mathbb{R}^d)$ if $d \ge 3$,
but not to $L^1(\mathbb{R}^d)$.
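
The two claims about this example can be checked in polar coordinates. The following is a sketch of the computation for $f(x) = 1/(1 + |x|^{d-1})$, using the elementary bounds $\tfrac12 r^{-(d-1)} \le 1/(1 + r^{d-1}) \le r^{-(d-1)}$ for $r \ge 1$; the symbol $\omega_{d-1}$ for the surface area of the unit sphere is our notation, not the text's. Since $f$ is bounded, only the region $|x| \ge 1$ matters.

```latex
\int_{|x| \ge 1} |f(x)|^2 \, dx
  \le \omega_{d-1} \int_1^\infty r^{-2(d-1)} \, r^{d-1} \, dr
  = \omega_{d-1} \int_1^\infty r^{-(d-1)} \, dr
  = \frac{\omega_{d-1}}{d-2} < \infty \quad (d \ge 3),
\qquad
\int_{|x| \ge 1} f(x) \, dx
  \ge \frac{\omega_{d-1}}{2} \int_1^\infty r^{-(d-1)} \, r^{d-1} \, dr
  = \frac{\omega_{d-1}}{2} \int_1^\infty dr = \infty .
```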

The proof of Theorem 4.5 actually gives an essentially stronger result,
which we state as a corollary.

Corollary 4.6 Suppose $f$ is continuous and has compact support in
$\mathbb{R}^d$, $d \ge 3$. Then for any $\alpha$, $0 < \alpha < 1/2$, the inequality (2) holds with
$\mathcal{R}^*(f)(\gamma)$ replaced by

$$\sup_{t_1 \ne t_2} \frac{|\mathcal{R}(f)(t_1, \gamma) - \mathcal{R}(f)(t_2, \gamma)|}{|t_1 - t_2|^{\alpha}}. \tag{3}$$

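The proof of the theorem relies on the interplay between the Radon
transform and the Fourier transform.

For fixed $\gamma \in S^{d-1}$, we let $\hat{\mathcal{R}}(f)(\lambda, \gamma)$ denote the Fourier transform of
$\mathcal{R}(f)(t, \gamma)$ in the $t$-variable

$$\hat{\mathcal{R}}(f)(\lambda, \gamma) = \int_{-\infty}^{\infty} \mathcal{R}(f)(t, \gamma)\, e^{-2\pi i \lambda t} \, dt.$$

In particular, we use $\lambda \in \mathbb{R}$ to denote the dual variable of $t$.

We also write $\hat f$ for the Fourier transform of $f$ as a function on $\mathbb{R}^d$,
namely

$$\hat f(\xi) = \int_{\mathbb{R}^d} f(x)\, e^{-2\pi i x \cdot \xi} \, dx.$$

Lemma 4.7 If $f$ is continuous with compact support, then for every
$\gamma \in S^{d-1}$ we have

$$\hat{\mathcal{R}}(f)(\lambda, \gamma) = \hat f(\lambda \gamma).$$

The right-hand side is just the Fourier transform of $f$ evaluated at the
point $\lambda \gamma$.

Proof. For each unit vector $\gamma$ we use the adapted coordinate system
described above: $x = (x_1, \ldots, x_d)$ where $\gamma$ coincides with the $x_d$ direction.
We can then write each $x \in \mathbb{R}^d$ as $x = (u, t)$ with $u \in \mathbb{R}^{d-1}$, $t \in \mathbb{R}$,
where $x \cdot \gamma = t = x_d$ and $u = (x_1, \ldots, x_{d-1})$. Moreover

$$\int_{\mathcal{P}_{t,\gamma}} f = \int_{\mathbb{R}^{d-1}} f(u, t) \, du,$$

and Fubini's theorem shows that $\int_{\mathbb{R}^d} f(x) \, dx = \int_{-\infty}^{\infty} \left( \int_{\mathcal{P}_{t,\gamma}} f \right) dt$.
Applying this to $f(x) e^{-2\pi i x \cdot (\lambda \gamma)}$ in place of $f(x)$ gives

$$\hat f(\lambda \gamma) = \int_{\mathbb{R}^d} f(x)\, e^{-2\pi i x \cdot (\lambda \gamma)} \, dx
= \int_{-\infty}^{\infty} \left( \int_{\mathbb{R}^{d-1}} f(u, t) \, du \right) e^{-2\pi i \lambda t} \, dt
= \int_{-\infty}^{\infty} \left( \int_{\mathcal{P}_{t,\gamma}} f \right) e^{-2\pi i \lambda t} \, dt.$$

Therefore $\hat f(\lambda \gamma) = \hat{\mathcal{R}}(f)(\lambda, \gamma)$, and the lemma is proved.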
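
The identity of Lemma 4.7 can be tested numerically in the plane. The sketch below uses the Gaussian $f(x) = e^{-\pi |x|^2}$, which is not compactly supported; this is an assumption made for convenience, since the Gaussian decays fast enough for the quadrature and its Fourier transform (with the convention above) is again $e^{-\pi |\xi|^2}$. The functions `radon`, `radon_hat` and `f_hat` are hypothetical names, and the integrals are approximated by equally spaced sums on $[-6, 6]$.

```python
import cmath
import math


def f(x, y):
    """Test function on R^2: the Gaussian exp(-pi |x|^2)."""
    return math.exp(-math.pi * (x * x + y * y))


def f_hat(xi_x, xi_y):
    """Closed-form Fourier transform of the Gaussian above."""
    return math.exp(-math.pi * (xi_x * xi_x + xi_y * xi_y))


def radon(t, theta, half_width=6.0, h=0.05):
    """Approximate R(f)(t, gamma): integral of f over the line x . gamma = t."""
    gx, gy = math.cos(theta), math.sin(theta)   # unit vector gamma
    px, py = -gy, gx                            # unit vector along the line
    n = int(round(2 * half_width / h))
    total = 0.0
    for j in range(n + 1):
        s = -half_width + j * h
        total += f(t * gx + s * px, t * gy + s * py)
    return total * h


def radon_hat(lam, theta, half_width=6.0, h=0.05):
    """Approximate the Fourier transform of t -> R(f)(t, gamma) at lam."""
    n = int(round(2 * half_width / h))
    total = 0j
    for j in range(n + 1):
        t = -half_width + j * h
        total += radon(t, theta) * cmath.exp(-2j * math.pi * lam * t)
    return total * h


theta, lam = 0.3, 0.7
lhs = radon_hat(lam, theta)
rhs = f_hat(lam * math.cos(theta), lam * math.sin(theta))
print(abs(lhs - rhs))  # should be very small
```

The printed difference measures only quadrature and rounding error for this one test function; it illustrates the lemma but does not replace the Fubini argument.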


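Lemma 4.8 If $f$ is continuous with compact support, then

$$\int_{S^{d-1}} \left( \int_{-\infty}^{\infty} |\hat{\mathcal{R}}(f)(\lambda, \gamma)|^2 \, |\lambda|^{d-1} \, d\lambda \right) d\sigma(\gamma) = 2 \int_{\mathbb{R}^d} |f(x)|^2 \, dx.$$

Let us observe the crucial point that the greater the dimension $d$, the
larger the factor $|\lambda|^{d-1}$ as $|\lambda|$ tends to infinity. Hence the greater the
dimension, the better the decay of the Fourier transform $\hat{\mathcal{R}}(f)(\lambda, \gamma)$,
and so the better the regularity of the Radon transform $\mathcal{R}(f)(t, \gamma)$ as a
function of $t$.

Proof. The Plancherel formula in Chapter 5 guarantees that

$$2 \int_{\mathbb{R}^d} |f(x)|^2 \, dx = 2 \int_{\mathbb{R}^d} |\hat f(\xi)|^2 \, d\xi.$$

Changing to polar coordinates $\xi = \lambda \gamma$ where $\lambda > 0$ and $\gamma \in S^{d-1}$, we
obtain

$$2 \int_{\mathbb{R}^d} |\hat f(\xi)|^2 \, d\xi = 2 \int_{S^{d-1}} \int_0^{\infty} |\hat f(\lambda \gamma)|^2 \lambda^{d-1} \, d\lambda \, d\sigma(\gamma).$$

We now observe that a simple change of variables provides

$$\int_{S^{d-1}} \int_0^{\infty} |\hat f(\lambda \gamma)|^2 \lambda^{d-1} \, d\lambda \, d\sigma(\gamma)
= \int_{S^{d-1}} \int_{-\infty}^{0} |\hat f(\lambda \gamma)|^2 |\lambda|^{d-1} \, d\lambda \, d\sigma(\gamma),$$

and the proof is complete once we invoke the result of Lemma 4.7.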

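The final ingredient in the proof of Theorem 4.5 consists of the following:

Lemma 4.9 Suppose

$$F(t) = \int_{-\infty}^{\infty} \hat F(\lambda)\, e^{2\pi i \lambda t} \, d\lambda,$$

where

$$\sup_{\lambda \in \mathbb{R}} |\hat F(\lambda)| \le A \quad \text{and} \quad \int_{-\infty}^{\infty} |\hat F(\lambda)|^2 \, |\lambda|^{d-1} \, d\lambda \le B^2.$$

Then

$$\sup_{t \in \mathbb{R}} |F(t)| \le c(A + B). \tag{4}$$

Moreover, if $0 < \alpha < 1/2$, then

$$|F(t_1) - F(t_2)| \le c_\alpha |t_1 - t_2|^{\alpha} (A + B) \quad \text{for all } t_1, t_2. \tag{5}$$

Proof. The first inequality is obtained by considering separately the
two cases $|\lambda| \le 1$ and $|\lambda| > 1$. We write

$$F(t) = \int_{|\lambda| \le 1} \hat F(\lambda)\, e^{2\pi i \lambda t} \, d\lambda + \int_{|\lambda| > 1} \hat F(\lambda)\, e^{2\pi i \lambda t} \, d\lambda.$$

Clearly, the first integral is bounded by $cA$. To estimate the second
integral it suffices to bound $\int_{|\lambda| > 1} |\hat F(\lambda)| \, d\lambda$. An application of the
Cauchy-Schwarz inequality gives

$$\int_{|\lambda| > 1} |\hat F(\lambda)| \, d\lambda \le \left( \int_{|\lambda| > 1} |\hat F(\lambda)|^2 |\lambda|^{d-1} \, d\lambda \right)^{1/2} \left( \int_{|\lambda| > 1} |\lambda|^{-d+1} \, d\lambda \right)^{1/2}.$$

This last integral is convergent precisely when $-d + 1 < -1$, which is
equivalent to $d > 2$, namely $d \ge 3$, which we assume. Hence $|F(t)| \le
c(A + B)$ as desired.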

To establish Lipschitz continuity, we first note that

$$F(t_1) - F(t_2) = \int_{-\infty}^{\infty} \hat F(\lambda) \left[ e^{2\pi i \lambda t_1} - e^{2\pi i \lambda t_2} \right] d\lambda.$$

Since one has the inequality² $|e^{ix} - 1| \le |x|$, we immediately see that

$$|e^{2\pi i \lambda t_1} - e^{2\pi i \lambda t_2}| \le c |t_1 - t_2|^{\alpha} |\lambda|^{\alpha} \quad \text{if } 0 \le \alpha < 1.$$

We may then write the difference $F(t_1) - F(t_2)$ as a sum of two integrals.
The integral over $|\lambda| \le 1$ is clearly bounded by $cA |t_1 - t_2|^{\alpha}$. The
second integral, the one over $|\lambda| > 1$, can be estimated from above by

$$c |t_1 - t_2|^{\alpha} \int_{|\lambda| > 1} |\hat F(\lambda)| \, |\lambda|^{\alpha} \, d\lambda.$$

An application of the Cauchy-Schwarz inequality shows that this last
integral is majorized by

$$\left( \int_{|\lambda| > 1} |\hat F(\lambda)|^2 |\lambda|^{d-1} \, d\lambda \right)^{1/2} \left( \int_{|\lambda| > 1} |\lambda|^{-d+1+2\alpha} \, d\lambda \right)^{1/2} \le c_\alpha B,$$

since the second integral is finite if $-d + 1 + 2\alpha < -1$, and in particular
this holds if $\alpha < 1/2$ when $d \ge 3$. This concludes the proof of the lemma.
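
For the record, the two power integrals used in this proof can be evaluated explicitly. The following worked computation is a supplement to the argument, not part of the original proof:

```latex
\int_{|\lambda| > 1} |\lambda|^{-d+1} \, d\lambda
  = 2 \int_1^\infty \lambda^{1-d} \, d\lambda
  = \frac{2}{d-2} \quad (d > 2),
\qquad
\int_{|\lambda| > 1} |\lambda|^{-d+1+2\alpha} \, d\lambda
  = 2 \int_1^\infty \lambda^{1-d+2\alpha} \, d\lambda
  = \frac{2}{d-2-2\alpha} \quad (2\alpha < d-2).
```

When $d = 3$ the second condition reads $\alpha < 1/2$, which is where the restriction on the Lipschitz exponent comes from.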


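We now gather these results to prove the theorem. For each $\gamma \in S^{d-1}$
let

$$F(t) = \mathcal{R}(f)(t, \gamma).$$

Note that with this definition we have

$$\sup_{t \in \mathbb{R}} |F(t)| = \mathcal{R}^*(f)(\gamma).$$

Let

$$A(\gamma) = \sup_{\lambda} |\hat F(\lambda)| \quad \text{and} \quad B^2(\gamma) = \int_{-\infty}^{\infty} |\hat F(\lambda)|^2 \, |\lambda|^{d-1} \, d\lambda.$$

Then by (4)

$$\sup_{t \in \mathbb{R}} |F(t)| \le c(A(\gamma) + B(\gamma)).$$

However, we observed that $\hat F(\lambda) = \hat f(\lambda \gamma)$, and hence

$$A(\gamma) \le \|f\|_{L^1(\mathbb{R}^d)}.$$

² The distance in the plane from the point $e^{ix}$ to the point 1 is shorter than the length
of the arc on the unit circle joining them.

Therefore

$$|\mathcal{R}^*(f)(\gamma)|^2 \le c(A(\gamma)^2 + B(\gamma)^2)$$

and thus

$$\int_{S^{d-1}} |\mathcal{R}^*(f)(\gamma)|^2 \, d\sigma(\gamma) \le c \left( \|f\|_{L^1(\mathbb{R}^d)}^2 + \|f\|_{L^2(\mathbb{R}^d)}^2 \right)$$

since $\int_{S^{d-1}} B^2(\gamma) \, d\sigma(\gamma) = 2 \|f\|_{L^2}^2$ by Lemma 4.8. Consequently,

$$\int_{S^{d-1}} \mathcal{R}^*(f)(\gamma) \, d\sigma(\gamma) \le c \left( \|f\|_{L^1(\mathbb{R}^d)} + \|f\|_{L^2(\mathbb{R}^d)} \right).$$

Note that the identity we have used,

$$\mathcal{R}(f)(t, \gamma) = \int_{-\infty}^{\infty} \hat F(\lambda)\, e^{2\pi i \lambda t} \, d\lambda$$

with $F(t) = \mathcal{R}(f)(t, \gamma)$, is justified for almost every $\gamma \in S^{d-1}$ by the
Fourier inversion result in Theorem 4.2 of Chapter 2. Indeed, we have
seen that $A(\gamma)$ and $B(\gamma)$ are finite for almost every $\gamma$, and thus $\hat F$ is
integrable for those $\gamma$. This completes the proof of the theorem. The
corollary follows the same way if we use (5) instead of (4).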

We now return to the situation in the plane to see what information
we may deduce from the above analysis. The inequality (2) as it stands
does not hold when $d = 2$. However, a modification of it does hold, and
this will be used in the proof of Theorem 4.4.

If $f \in L^1(\mathbb{R}^d)$ we define

$$\mathcal{R}_\delta(f)(t, \gamma) = \frac{1}{2\delta} \int_{t - \delta \le x \cdot \gamma \le t + \delta} f(x) \, dx.$$

In this definition of $\mathcal{R}_\delta(f)(t, \gamma)$ we integrate the function $f$ in a small
"strip" of width $2\delta$ around the plane $\mathcal{P}_{t,\gamma}$. Thus $\mathcal{R}_\delta$ is an average of
Radon transforms.

We let

$$\mathcal{R}^*_\delta(f)(\gamma) = \sup_{t \in \mathbb{R}} |\mathcal{R}_\delta(f)(t, \gamma)|.$$



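Theorem 4.10 If $f$ is continuous with compact support, then

$$\int_{S^1} \mathcal{R}^*_\delta(f)(\gamma) \, d\sigma(\gamma) \le c (\log 1/\delta)^{1/2} \left( \|f\|_{L^1(\mathbb{R}^2)} + \|f\|_{L^2(\mathbb{R}^2)} \right)$$

when $0 < \delta \le 1/2$.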

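The same argument as in the proof of Theorem 4.5 applies here, except
that we need a modified version of Lemma 4.9. More precisely, let us set

$$F_\delta(t) = \int_{-\infty}^{\infty} \hat F(\lambda) \left( \frac{e^{2\pi i (t + \delta) \lambda} - e^{2\pi i (t - \delta) \lambda}}{2\pi i \lambda (2\delta)} \right) d\lambda$$

and suppose that

$$\sup_{\lambda} |\hat F(\lambda)| \le A \quad \text{and} \quad \int_{-\infty}^{\infty} |\hat F(\lambda)|^2 \, |\lambda| \, d\lambda \le B^2.$$

Then we claim that

$$\sup_{t} |F_\delta(t)| \le c (\log 1/\delta)^{1/2} (A + B). \tag{6}$$

Indeed, we use the fact that $|(\sin x)/x| \le 1$ to see that, in the definition
of $F_\delta(t)$, the integral over $|\lambda| \le 1$ gives the $cA$. Also, the integral over
$|\lambda| > 1$ can be split and is bounded by the sum

$$\int_{1 < |\lambda| \le 1/\delta} |\hat F(\lambda)| \, d\lambda + \frac{c}{\delta} \int_{1/\delta \le |\lambda|} |\hat F(\lambda)| \, |\lambda|^{-1} \, d\lambda.$$

The first integral above can be estimated by the Cauchy-Schwarz
inequality, as follows

$$\int_{1 < |\lambda| \le 1/\delta} |\hat F(\lambda)| \, d\lambda \le c \left( \int_{1 < |\lambda| \le 1/\delta} |\hat F(\lambda)|^2 |\lambda| \, d\lambda \right)^{1/2} \left( \int_{1 < |\lambda| \le 1/\delta} |\lambda|^{-1} \, d\lambda \right)^{1/2}
\le c B (\log 1/\delta)^{1/2}.$$

Finally, we also note that

$$\frac{c}{\delta} \int_{1/\delta \le |\lambda|} |\hat F(\lambda)| \, |\lambda|^{-1} \, d\lambda \le \frac{c}{\delta} \left( \int_{1/\delta \le |\lambda|} |\hat F(\lambda)|^2 |\lambda| \, d\lambda \right)^{1/2} \left( \int_{1/\delta \le |\lambda|} |\lambda|^{-3} \, d\lambda \right)^{1/2}
\le c B,$$

and this establishes (6), and hence the theorem.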
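
The logarithmic factor and the bounded tail both come from explicit one-variable integrals. As a supplementary worked computation (not part of the original argument):

```latex
\int_{1 < |\lambda| \le 1/\delta} |\lambda|^{-1} \, d\lambda
  = 2 \int_1^{1/\delta} \frac{d\lambda}{\lambda}
  = 2 \log(1/\delta),
\qquad
\frac{1}{\delta} \left( \int_{|\lambda| \ge 1/\delta} |\lambda|^{-3} \, d\lambda \right)^{1/2}
  = \frac{1}{\delta} \left( 2 \cdot \frac{\delta^2}{2} \right)^{1/2}
  = 1 .
```

The first integral diverges as $\delta \to 0$, which is why the estimate in the plane carries the factor $(\log 1/\delta)^{1/2}$, while the second shows that the factor $1/\delta$ in front of the tail is exactly compensated.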
4.2 Regularity of sets when d ≥ 3

We now extend to the general context the basic estimates for the Radon
transform, proved for continuous functions of compact support. This will
yield the regularity result formulated in Theorem 4.1.

Proposition 4.11 Suppose $d \ge 3$, and let $f$ belong to $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$.
Then for a.e. $\gamma \in S^{d-1}$ we can assert the following:

(a) $f$ is measurable and integrable on the plane $\mathcal{P}_{t,\gamma}$, for every $t \in \mathbb{R}$.

(b) The function $\mathcal{R}(f)(t, \gamma)$ is continuous in $t$ and satisfies a Lipschitz
condition with exponent $\alpha$ for each $\alpha < 1/2$. Moreover, the
inequality (2) of Theorem 4.5 and its variant with $\mathcal{R}^*$ replaced by (3) hold for $f$.

We prove this in a series of steps.

Step 1. We consider $f = \chi_{\mathcal{O}}$, the characteristic function of a bounded
open set $\mathcal{O}$. Here the assertion (a) is evident since $\mathcal{O} \cap \mathcal{P}_{t,\gamma}$ is an open
and bounded set in $\mathcal{P}_{t,\gamma}$. Thus $\mathcal{R}(f)(t, \gamma)$ is defined for
all $(t, \gamma)$.

Next we can find a sequence $\{f_n\}$ of non-negative continuous functions
of compact support so that for every $x$, $f_n(x)$ increases to $f(x)$ as
$n \to \infty$. Thus $\mathcal{R}(f_n)(t, \gamma) \to \mathcal{R}(f)(t, \gamma)$ for every $(t, \gamma)$ by the monotone
convergence theorem, and also $\mathcal{R}^*(f_n)(\gamma) \to \mathcal{R}^*(f)(\gamma)$ for each $\gamma \in S^{d-1}$.
As a result we see that the inequality (2) is valid for $f = \chi_{\mathcal{O}}$, with $\mathcal{O}$
open and bounded.

Step 2. We now consider $f = \chi_E$, where $E$ is a set of measure zero,
and take first the case when the set $E$ is bounded. Then we can find a
decreasing sequence $\{\mathcal{O}_n\}$ of open and bounded sets, such that $E \subset \mathcal{O}_n$,
while $m(\mathcal{O}_n) \to 0$ as $n \to \infty$.

Let $\tilde E = \bigcap_n \mathcal{O}_n$. Since $\tilde E \cap \mathcal{P}_{t,\gamma}$ is measurable for every $(t, \gamma)$, the functions
$\mathcal{R}(\chi_{\tilde E})(t, \gamma)$ and $\mathcal{R}^*(\chi_{\tilde E})(\gamma)$ are well-defined. However, $\mathcal{R}^*(\chi_{\tilde E})(\gamma) \le
\mathcal{R}^*(\chi_{\mathcal{O}_n})(\gamma)$, while the $\mathcal{R}^*(\chi_{\mathcal{O}_n})$ decrease. Thus the inequality (2) we
have just proved for $f = \chi_{\mathcal{O}_n}$ shows that $\mathcal{R}^*(\chi_{\tilde E})(\gamma) = 0$ for a.e. $\gamma$. The
fact that $E \subset \tilde E$ then implies that for a.e. $\gamma$, the set $E \cap \mathcal{P}_{t,\gamma}$ has $(d - 1)$-dimensional
measure zero for every $t \in \mathbb{R}$. This conclusion immediately
extends to the case when $E$ is not necessarily bounded, by writing $E$ as a
countable union of bounded sets of measure zero. Therefore Corollary 4.2
is proved.

Step 3. Here we assume that $f$ is a bounded measurable function
supported on a bounded set. Then by familiar arguments we can find
a sequence $\{f_n\}$ of continuous functions that are uniformly bounded,
supported in a fixed compact set, and so that $f_n(x) \to f(x)$ a.e. By the
bounded convergence theorem, $\|f_n - f\|_{L^1}$ and $\|f_n - f\|_{L^2}$ both tend to
zero as $n \to \infty$, and upon selecting a subsequence if necessary, we can
suppose that $\|f_n - f\|_{L^1} + \|f_n - f\|_{L^2} \le 2^{-n}$. By what we have just
proved in Step 2 we have, for a.e. $\gamma \in S^{d-1}$, that $f_n(x) \to f(x)$ on $\mathcal{P}_{t,\gamma}$
a.e. with respect to the measure $m_{d-1}$, for each $t \in \mathbb{R}$. Thus again by the
bounded convergence theorem for those $(t, \gamma)$, we see that $\mathcal{R}(f_n)(t, \gamma) \to
\mathcal{R}(f)(t, \gamma)$, and this limit defines $\mathcal{R}(f)$. Now applying Theorem 4.5 to
$f_n - f_{n-1}$ gives

$$\sum_{n=1}^{\infty} \int_{S^{d-1}} \sup_t |\mathcal{R}(f_n)(t, \gamma) - \mathcal{R}(f_{n-1})(t, \gamma)| \, d\sigma(\gamma) \le c \sum_{n=1}^{\infty} 2^{-n} < \infty.$$

This means that

$$\sum_n \sup_t |\mathcal{R}(f_n)(t, \gamma) - \mathcal{R}(f_{n-1})(t, \gamma)| < \infty$$

for a.e. $\gamma \in S^{d-1}$, and hence for those $\gamma$ the sequence of functions $\mathcal{R}(f_n)(t, \gamma)$
converges uniformly. As a consequence, for those $\gamma$ the function $\mathcal{R}(f)(t, \gamma)$
is continuous in $t$, and the inequality (2) is valid for this $f$. The inequality
with (3) is deduced in the same way.

Finally, we deal with the general $f$ in $L^1 \cap L^2$ by approximating it by
a sequence of bounded functions each with bounded support. The details
of the argument are similar to the case treated above and are left to the
reader.

Observe that the special case $f = \chi_E$ of the proposition gives us Theorem 4.1.

4.3 Besicovitch sets have dimension 2 

Here we prove Theorem 4.4, that any Besicovitch set necessarily has
Hausdorff dimension 2. We use Theorem 4.10, namely, the inequality

$$\int_{S^1} \mathcal{R}^*_\delta(f)(\gamma) \, d\sigma(\gamma) \le c (\log 1/\delta)^{1/2} \left( \|f\|_{L^1(\mathbb{R}^2)} + \|f\|_{L^2(\mathbb{R}^2)} \right).$$

This inequality was proved under the assumption that $f$ was continuous
and had compact support. In the present situation it goes over without
difficulty to the general case where $f \in L^1 \cap L^2$, by an easy limiting
argument, since it is clear that $\mathcal{R}^*_\delta(f_n)(\gamma)$ converges to $\mathcal{R}^*_\delta(f)(\gamma)$ for all
$\gamma$ if $f_n \to f$ in the $L^1$-norm.

Now suppose $F$ is a Besicovitch set and $\alpha$ is fixed with $0 < \alpha < 2$.
Assume that $F \subset \bigcup_{i=1}^{\infty} B_i$ is a covering, where $B_i$ are balls with diameter
less than a given number. We must show that

$$\sum_i (\operatorname{diam} B_i)^{\alpha} \ge c_\alpha > 0.$$

We proceed in two steps, considering first a simple situation that will
make clear the idea of the proof.


Case 1. We suppose first that all the balls $B_i$ have the same diameter
$\delta$ (with $\delta \le 1/2$) and also that there are only a finite number, say $N$, of
balls in the covering. We must prove that $N \delta^{\alpha} \ge c_\alpha$.

Let $B_i^*$ denote the double of $B_i$ and $F^* = \bigcup_i B_i^*$. Then, we clearly
have

$$m(F^*) \le c N \delta^2.$$

Since $F$ is a Besicovitch set, for each $\gamma \in S^1$ there is a segment $s_\gamma$ of
unit length, perpendicular to $\gamma$, and which is contained in $F$. Also, by
construction, any translate by less than $\delta$ of a point in $s_\gamma$ must belong
to $F^*$. Hence

$$\mathcal{R}^*_\delta(\chi_{F^*})(\gamma) \ge 1 \quad \text{for every } \gamma.$$

If we take $f = \chi_{F^*}$ in the inequality (6), and note that the Cauchy-Schwarz
inequality implies

$$\|\chi_{F^*}\|_{L^1(\mathbb{R}^2)} \le c \|\chi_{F^*}\|_{L^2(\mathbb{R}^2)} \le c (m(F^*))^{1/2},$$

then we obtain

$$c \le N^{1/2} \delta (\log 1/\delta)^{1/2}.$$

This implies $N \delta^{\alpha} \ge c$ for $\alpha < 2$.
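
The last implication can be spelled out as follows. This is a supplementary worked step, with $c$ the constant in the preceding inequality and $c_\alpha$ our name for the resulting lower bound:

```latex
c \le N^{1/2} \, \delta \, (\log 1/\delta)^{1/2}
\;\Longrightarrow\;
N \ge \frac{c^2}{\delta^2 \log(1/\delta)}
\;\Longrightarrow\;
N \delta^{\alpha} \ge c^2 \, \frac{\delta^{-(2-\alpha)}}{\log(1/\delta)}
\ge c_\alpha > 0
\quad \text{for } 0 < \delta \le 1/2 .
```

The final bound holds because, for $\alpha < 2$, the function $\delta \mapsto \delta^{-(2-\alpha)} / \log(1/\delta)$ is continuous and positive on $(0, 1/2]$ and tends to infinity as $\delta \to 0$, so it is bounded below by a positive constant.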

Case 2. We now treat the general case. Suppose $F \subset \bigcup_{i=1}^{\infty} B_i$, where
the balls $B_i$ each have diameter less than 1. For each integer $k$ let $N_k$ be
the number of balls in the collection $\{B_i\}$ for which

$$2^{-k-1} \le \operatorname{diam} B_i \le 2^{-k}.$$

We need to show that $\sum_k N_k 2^{-k\alpha} \ge c_\alpha$. In fact, we shall prove the
stronger result that there exists a positive integer $k'$ such that
$N_{k'} 2^{-k'\alpha} \ge c_\alpha$.

Let

$$F_k = F \cap \Bigl(\bigcup_{2^{-k-1} \le \operatorname{diam} B_i \le 2^{-k}} B_i\Bigr)$$

and let

$$F_k^* = \bigcup_{2^{-k-1} \le \operatorname{diam} B_i \le 2^{-k}} B_i^*,$$

where $B_i^*$ denotes the double of $B_i$. Then we note that
$m(F_k^*) \le cN_k 2^{-2k}$ for all $k$.

Since $F$ is a Besicovitch set, for each $\gamma \in S^1$ there is a segment $s_\gamma$ of
unit length entirely contained in $F$. We now make precise the fact that
for some $k$, a large proportion of $s_\gamma$ belongs to $F_k$.

We pick a sequence of real numbers $\{a_k\}_{k=0}^{\infty}$ such that $0 \le a_k \le 1$,
$\sum_{k=0}^{\infty} a_k = 1$, but $a_k$ does not tend to zero too quickly. For instance, we
may choose $a_k = c_\epsilon 2^{-\epsilon k}$ with $c_\epsilon = 1 - 2^{-\epsilon}$, and $\epsilon > 0$ but $\epsilon$ sufficiently
small.
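
As a quick numerical sanity check of this choice, the following Python sketch sums the sequence $a_k = c_\epsilon 2^{-\epsilon k}$ starting from $k = 0$ and compares the partial sums with the closed form $1 - 2^{-\epsilon n}$ of the geometric series. The function names and the sample value $\epsilon = 0.1$ are illustrative assumptions, not part of the text.

```python
import math

def a(k, eps):
    """Term a_k = c_eps * 2**(-eps*k), with c_eps = 1 - 2**(-eps)."""
    c_eps = 1.0 - 2.0 ** (-eps)
    return c_eps * 2.0 ** (-eps * k)

def partial_sum(n, eps):
    """Sum of a_0, ..., a_{n-1}; the geometric series gives 1 - 2**(-eps*n)."""
    return sum(a(k, eps) for k in range(n))

if __name__ == "__main__":
    eps = 0.1  # illustrative value only
    for n in (10, 100, 1000):
        print(n, partial_sum(n, eps), 1.0 - 2.0 ** (-eps * n))
```

The partial sums increase to 1, and the terms decay only like $2^{-\epsilon k}$, which is the sense in which $a_k$ "does not tend to zero too quickly".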

Then, for some $k$ we must have

$$m_1(s_\gamma \cap F_k) \ge a_k.$$

Otherwise, since $F = \bigcup_k F_k$, we would have

$$m_1(s_\gamma \cap F) < \sum_k a_k = 1,$$

and this contradicts the fact that $m_1(s_\gamma \cap F) = 1$, since $s_\gamma$ is entirely
contained in $F$.

Therefore, with this $k$, we must have

$$\mathcal{R}^*_{2^{-k}}(\chi_{F_k^*})(\gamma) \ge a_k,$$

because any point of distance less than $2^{-k}$ from $F_k$ must belong to $F_k^*$.
Since the choice of $k$ may depend on $\gamma$, we let

$$E_k = \{\gamma : \mathcal{R}^*_{2^{-k}}(\chi_{F_k^*})(\gamma) \ge a_k\}.$$

By our previous observations, we have

$$S^1 = \bigcup_k E_k,$$

and so for at least one $k$, which we denote by $k'$, we have

$$m(E_{k'}) \ge 2\pi a_{k'},$$

for otherwise $m(S^1) < 2\pi \sum_k a_k = 2\pi$. As a result

$$2\pi a_{k'}^2 = 2\pi a_{k'} a_{k'} \le \int_{E_{k'}} a_{k'}\, d\sigma(\gamma) \le \int_{S^1} \mathcal{R}^*_{2^{-k'}}(\chi_{F^*_{k'}})(\gamma)\, d\sigma(\gamma).$$

By the fundamental inequality (6) we get

$$a_{k'}^2 \le c\,(\log 2^{k'})^{1/2}\, \|\chi_{F^*_{k'}}\|_{L^2(\mathbb{R}^2)}.$$

Recalling that by our choice $a_k \approx 2^{-\epsilon k}$, and noting that
$\|\chi_{F^*_{k'}}\|_{L^2} \le c N_{k'}^{1/2} 2^{-k'}$, we obtain

$$2^{(1-2\epsilon)k'} \le c\,(\log 2^{k'})^{1/2} N_{k'}^{1/2}.$$

Finally, this last inequality guarantees that $N_{k'} 2^{-\alpha k'} \ge c_\alpha$ as long as
$4\epsilon < 2 - \alpha$.
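
The final step of Case 2 can be unwound as follows. This is a sketch of the algebra only, starting from the inequality $2^{(1-2\epsilon)k'} \le c(\log 2^{k'})^{1/2} N_{k'}^{1/2}$ derived above; $c$ denotes a generic positive constant, and $\eta$ is a name introduced here for the exponent.

```latex
\begin{aligned}
2^{(1-2\epsilon)k'} \le c\,(k'\log 2)^{1/2}\,N_{k'}^{1/2}
  &\;\Longrightarrow\; N_{k'} \ge c\,\frac{2^{(2-4\epsilon)k'}}{k'} \\
  &\;\Longrightarrow\; N_{k'}\,2^{-\alpha k'} \ge c\,\frac{2^{\eta k'}}{k'},
  \qquad \eta = 2-\alpha-4\epsilon .
\end{aligned}
% If 4\epsilon < 2-\alpha then \eta > 0, and 2^{\eta k}/k is bounded below by a
% positive constant over all integers k \ge 1. This gives
% N_{k'} 2^{-\alpha k'} \ge c_\alpha with c_\alpha independent of k'.
```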

This concludes the proof of the theorem. 

4.4 Construction of a Besicovitch set 

There are a number of different constructions of Besicovitch sets. The one 
we have chosen to describe here involves the concept of self-replicating 
sets, an idea that permeates much of the discussion of this chapter. 

We consider the Cantor set of constant dissection $\mathcal{C}_{1/2}$, which for simplicity
we shall write as $\mathcal{C}$, and which is defined in Exercise 3, Chapter 1.
Note that $\mathcal{C} = \bigcap_{k=0}^{\infty} C_k$, where $C_0 = [0,1]$, and $C_k$ is the union of $2^k$
closed intervals of length $4^{-k}$ obtained by removing from $C_{k-1}$ the $2^{k-1}$
centrally situated open intervals of length $\frac{1}{2} \cdot 4^{-k+1}$. The set $\mathcal{C}$ can also
be represented as the set of points $x \in [0,1]$ of the form $x = \sum_{k=1}^{\infty} \epsilon_k/4^k$,
with $\epsilon_k$ either 0 or 3.
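
The two descriptions of the approximating sets $C_k$ can be compared concretely. The Python sketch below (function name and use of exact fractions are choices made here for illustration) lists the left endpoints of the $2^k$ intervals of $C_k$: each interval of $C_{k-1}$ with left endpoint $x$ contributes the two intervals with left endpoints $x$ and $x + 3 \cdot 4^{-k}$, which corresponds to appending a base-4 digit 0 or 3.

```python
from fractions import Fraction

def cantor_lefts(k):
    """Left endpoints of the 2**k closed intervals (each of length 4**-k)
    whose union is the k-th approximation C_k of the Cantor set C_{1/2}."""
    lefts = [Fraction(0)]
    for level in range(1, k + 1):
        step = Fraction(3, 4 ** level)  # base-4 digit 3 in position `level`
        lefts = [x + d for x in lefts for d in (Fraction(0), step)]
    return sorted(lefts)

if __name__ == "__main__":
    for k in range(4):
        lefts = cantor_lefts(k)
        total_length = len(lefts) * Fraction(1, 4 ** k)
        print(k, len(lefts), total_length)
```

The total length of $C_k$ is $2^k \cdot 4^{-k} = 2^{-k}$, which tends to zero, consistent with $\mathcal{C}$ having measure zero.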

We now place a copy of $\mathcal{C}$ on the $x$-axis of the plane $\mathbb{R}^2 = \{(x, y)\}$, and a
copy of $\frac{1}{2}\mathcal{C}$ on the line $y = 1$. That is, we put $E_0 = \{(x, y) : x \in \mathcal{C},\ y = 0\}$
and $E_1 = \{(x, y) : 2x \in \mathcal{C},\ y = 1\}$. The set $F$ that will play the central
role is defined as the union of all line segments that join a point of $E_0$
with a point of $E_1$. (See Figure 13.)

Figure 13. Several line segments joining $E_0$ with $E_1$


Theorem 4.12 The set $F$ is compact and of two-dimensional measure
zero. It contains a translate of any unit line segment whose slope is a
number $s$ that lies outside the interval $(-1, 2)$.

Once the theorem is proved, our job is done. Indeed, a finite union of 
rotations of the set F contains unit segments of any slope, and that set 
is therefore a Besicovitch set. 

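The proof of the required properties of the set $F$ amounts to showing
the following paradoxical facts about the set $\mathcal{C} + \lambda\mathcal{C}$, for $\lambda > 0$. Here
$\mathcal{C} + \lambda\mathcal{C} = \{x_1 + \lambda x_2 : x_1 \in \mathcal{C},\ x_2 \in \mathcal{C}\}$:

• $\mathcal{C} + \lambda\mathcal{C}$ has one-dimensional measure zero, for a.e. $\lambda$.

• $\mathcal{C} + \frac{1}{2}\mathcal{C}$ is the interval $[0, 3/2]$.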

Let us see how these two assertions imply the theorem. First, we note
that the set $F$ is closed (and hence compact), because both $E_0$ and $E_1$
are closed. Next observe that with $0 < y < 1$, the slice $F^y$ of the set
$F$ is exactly $(1 - y)\mathcal{C} + \frac{y}{2}\mathcal{C}$. This set is obtained from the set $\mathcal{C} + \lambda\mathcal{C}$,
where $\lambda = y/(2(1 - y))$, by scaling with the factor $1 - y$. Hence $F^y$ is of
measure zero whenever $\mathcal{C} + \lambda\mathcal{C}$ is also of measure zero. Moreover, under
the mapping $y \mapsto \lambda$, sets of measure zero in $(0, \infty)$ correspond to sets of
measure zero in $(0,1)$. (For this see, for example, Exercise 8 in Chapter 1,
or Problem 1 in Chapter 6.) Therefore, the first assertion and Fubini's
theorem prove that the (two-dimensional) measure of $F$ is zero.

Finally, the slope $s$ of the segment joining the point $(x_0, 0)$ with the
point $(x_1, 1)$ is $s = 1/(x_1 - x_0)$. Thus the quantity $s$ can be realized if
$x_1 \in \mathcal{C}/2$ and $x_0 \in \mathcal{C}$, that is, if $1/s \in \mathcal{C}/2 - \mathcal{C}$. However, by an obvious
symmetry $\mathcal{C} = 1 - \mathcal{C}$, and so the condition becomes $1/s \in \mathcal{C}/2 + \mathcal{C} - 1$,
which by the second assertion is $1/s \in [-1, 1/2]$. This last is equivalent
with $s \notin (-1, 2)$.

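Our task therefore remains the proof of the two assertions above. The
proof of the second is nearly trivial. In fact,

$$\tfrac{2}{3}\bigl(\mathcal{C} + \tfrac{1}{2}\mathcal{C}\bigr) = \tfrac{2}{3}\mathcal{C} + \tfrac{1}{3}\mathcal{C},$$

and this set consists of all $x$ of the form $x = \sum_{k=1}^{\infty} \bigl(\frac{2\epsilon_k}{3} + \frac{\epsilon_k'}{3}\bigr) 4^{-k}$, where
$\epsilon_k$ and $\epsilon_k'$ are independently 0 or 3. Since then $\frac{2\epsilon_k}{3} + \frac{\epsilon_k'}{3}$ can take any
of the values 0, 1, 2, or 3, we have that $\frac{2}{3}(\mathcal{C} + \frac{1}{2}\mathcal{C}) = [0,1]$, and hence
$\mathcal{C} + \frac{1}{2}\mathcal{C} = [0, 3/2]$.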
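
The digit argument can be checked exactly on truncated expansions. In the Python sketch below (names are illustrative), points of $\mathcal{C}$ are truncated to $n$ base-4 digits and scaled by $4^n$, so they become integers $x_1, x_2$ with digits 0 or 3; the quantity $4^n \cdot \frac{2}{3}(x_1/4^n + \frac{1}{2}x_2/4^n)$ is then the integer $(2x_1 + x_2)/3$, and the claim is that every $n$-digit base-4 integer arises. This verifies only the finite-digit statement; passing to the full interval uses infinite expansions as in the text.

```python
from itertools import product

def truncated_cantor_points(n):
    """Integers 4**n * x for x in C truncated to n base-4 digits (digits 0 or 3)."""
    return [sum(e * 4 ** (n - k) for k, e in enumerate(digits, start=1))
            for digits in product((0, 3), repeat=n)]

def scaled_sums(n):
    """All values of 4**n * (2/3) * (x1 + x2/2), i.e. (2*x1 + x2) / 3,
    over truncated Cantor points x1, x2. Division is exact because
    x1 and x2 are multiples of 3."""
    pts = truncated_cantor_points(n)
    return {(2 * x1 + x2) // 3 for x1 in pts for x2 in pts}

if __name__ == "__main__":
    n = 3
    print(scaled_sums(n) == set(range(4 ** n)))
```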


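The proof that $m(\mathcal{C} + \lambda\mathcal{C}) = 0$ for a.e. $\lambda$

We come to the main point: that $\mathcal{C} + \lambda\mathcal{C}$ has measure zero for almost all
$\lambda$. We show this by examining the self-replicating properties of the sets
$\mathcal{C}$ and $\mathcal{C} + \lambda\mathcal{C}$.

We know that $\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2$, where $\mathcal{C}_1$ and $\mathcal{C}_2$ are two similar copies
of $\mathcal{C}$, obtained with similarity ratio 1/4, and given by $\mathcal{C}_1 = \frac{1}{4}\mathcal{C}$ and
$\mathcal{C}_2 = \frac{1}{4}\mathcal{C} + \frac{3}{4}$. Thus $\mathcal{C}_1 \subset [0,1/4]$ and $\mathcal{C}_2 \subset [3/4,1]$. Iterating $\ell$ times this
decomposition of $\mathcal{C}$, that is, reaching the $\ell$th "generation," we can write

(7) $\mathcal{C} = \bigcup_{1 \le j \le 2^{\ell}} \mathcal{C}_j^{\ell},$

with $\mathcal{C}_1^{\ell} = (1/4)^{\ell}\mathcal{C}$ and each $\mathcal{C}_j^{\ell}$ a translate of $\mathcal{C}_1^{\ell}$.

We consider in the same way the set

$$\mathcal{K}(\lambda) = \mathcal{C} + \lambda\mathcal{C},$$

and we shall sometimes omit the $\lambda$ and write $\mathcal{K}(\lambda) = \mathcal{K}$, when this causes
no confusion. By its definition we have

$$\mathcal{K} = \mathcal{K}_1 \cup \mathcal{K}_2 \cup \mathcal{K}_3 \cup \mathcal{K}_4,$$

where $\mathcal{K}_1 = \mathcal{C}_1 + \lambda\mathcal{C}_1$, $\mathcal{K}_2 = \mathcal{C}_1 + \lambda\mathcal{C}_2$, $\mathcal{K}_3 = \mathcal{C}_2 + \lambda\mathcal{C}_1$, and $\mathcal{K}_4 = \mathcal{C}_2 + \lambda\mathcal{C}_2$.
An iteration of this decomposition using (7) gives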

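(8) $\mathcal{K} = \bigcup_{1 \le i \le 4^{\ell}} \mathcal{K}_i^{\ell},$

where each $\mathcal{K}_i^{\ell}$ equals $\mathcal{C}_{j_1}^{\ell} + \lambda\mathcal{C}_{j_2}^{\ell}$ for a pair of indices $j_1, j_2$. In fact,
this relation among the indices sets up a bijection between the $i$ with
$1 \le i \le 4^{\ell}$, and the pair $j_1, j_2$ with $1 \le j_1 \le 2^{\ell}$ and $1 \le j_2 \le 2^{\ell}$. Note
that each $\mathcal{K}_i^{\ell}$ is a translate of $\mathcal{K}_1^{\ell}$, and each $\mathcal{K}_i^{\ell}$ is also obtained from $\mathcal{K}$ by
a similarity of ratio $4^{-\ell}$. Now note that $\mathcal{C} = \mathcal{C}/4 \cup (\mathcal{C}/4 + 3/4)$ implies
that

$$\mathcal{K}(\lambda) = \mathcal{C} + \lambda\mathcal{C} = \Bigl(\mathcal{C} + \frac{\lambda}{4}\mathcal{C}\Bigr) \cup \Bigl(\mathcal{C} + \frac{\lambda}{4}\mathcal{C} + \frac{3\lambda}{4}\Bigr) = \mathcal{K}(\lambda/4) \cup \Bigl(\mathcal{K}(\lambda/4) + \frac{3\lambda}{4}\Bigr).$$

Thus $\mathcal{K}(\lambda)$ has measure zero if and only if $\mathcal{K}(\lambda/4)$ has measure zero.
Hence it suffices to prove that $\mathcal{K}(\lambda)$ has measure zero for a.e. $\lambda \in [1,4]$.

After these preliminaries let us observe that we immediately obtain
that $m(\mathcal{K}(\lambda)) = 0$ for some special $\lambda$'s, those for which the following
coincidence takes place: for some $\ell$ and a pair $i$ and $i'$ with $i \ne i'$,

$$\mathcal{K}_i^{\ell}(\lambda) = \mathcal{K}_{i'}^{\ell}(\lambda).$$

Indeed, if we have this coincidence, then (8) gives

$$m(\mathcal{K}(\lambda)) \le \sum_{i=1,\ i \ne i'}^{4^{\ell}} m(\mathcal{K}_i^{\ell}(\lambda)) = (4^{\ell} - 1)\,4^{-\ell}\, m(\mathcal{K}(\lambda)),$$

and this implies $m(\mathcal{K}(\lambda)) = 0$.

The key insight below is that, in a quantitative sense, the $\lambda$'s for which
this coincidence takes place are "dense" relative to the size of $\ell$. More
precisely, we have the following.

Proposition 4.13 Suppose $\lambda_0$ and $\ell$ are given, with $1 \le \lambda_0 \le 4$ and $\ell$
a positive integer. Then, there exist a $\bar\lambda$ and a pair $i, i'$ with $i \ne i'$ such
that

(9) $\mathcal{K}_i^{\ell}(\bar\lambda) = \mathcal{K}_{i'}^{\ell}(\bar\lambda)$ and $|\bar\lambda - \lambda_0| \le c\,4^{-\ell}$.

Here $c$ is a constant independent of $\lambda_0$ and $\ell$.

This is proved on the basis of the following observation.

Lemma 4.14 For every $\lambda_0$ there is a pair $1 \le i_1, i_2 \le 4$, with $i_1 \ne i_2$
such that $\mathcal{K}_{i_1}(\lambda_0)$ and $\mathcal{K}_{i_2}(\lambda_0)$ intersect.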
Proof. Indeed, if the $\mathcal{K}_i$ are disjoint for $1 \le i \le 4$ then for sufficiently
small $\delta$ the $\mathcal{K}_i^{\delta}$ are also disjoint. Here we have used the notation that $F^{\delta}$
denotes the set of points of distance less than $\delta$ from $F$. (See Lemma 3.1
in Chapter 1.) However, $\mathcal{K}^{\delta} = \bigcup_{i=1}^{4} \mathcal{K}_i^{\delta}$, and by similarity $m(\mathcal{K}^{4\delta}) = 4\,m(\mathcal{K}_i^{\delta})$.
Thus by the disjointness of the $\mathcal{K}_i^{\delta}$ we have $m(\mathcal{K}^{\delta}) = m(\mathcal{K}^{4\delta})$,
which is a contradiction, since $\mathcal{K}^{4\delta} - \mathcal{K}^{\delta}$ contains an open ball (of radius
$3\delta/2$). The lemma is therefore proved.

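Now apply the lemma for our given $\lambda_0$ and write $\mathcal{K}_{i_1} = \mathcal{C}_{\mu_1} + \lambda_0\mathcal{C}_{\nu_1}$,
$\mathcal{K}_{i_2} = \mathcal{C}_{\mu_2} + \lambda_0\mathcal{C}_{\nu_2}$, where the $\mu$'s and $\nu$'s are either 1 or 2. However, since
$i_1 \ne i_2$ we have $\mu_1 \ne \mu_2$ or $\nu_1 \ne \nu_2$ (or both). Assume for the moment
that $\nu_1 \ne \nu_2$.

The fact that $\mathcal{K}_{i_1}(\lambda_0)$ and $\mathcal{K}_{i_2}(\lambda_0)$ intersect means that there are pairs
of numbers $(a, b)$ and $(a', b')$, with $a \in \mathcal{C}_{\mu_1}$, $b \in \mathcal{C}_{\nu_1}$, $a' \in \mathcal{C}_{\mu_2}$, and $b' \in \mathcal{C}_{\nu_2}$
such that

(10) $a + \lambda_0 b = a' + \lambda_0 b'.$

Note that the fact that $\nu_1 \ne \nu_2$ means that $|b - b'| \ge 1/2$. Next, looking
at the $\ell$th generation we find via (7) that there are indices
$1 \le j_1, j_2, j_1', j_2' \le 2^{\ell}$ so that $a \in \mathcal{C}_{j_1}^{\ell} \subset \mathcal{C}_{\mu_1}$, $b \in \mathcal{C}_{j_2}^{\ell} \subset \mathcal{C}_{\nu_1}$,
$a' \in \mathcal{C}_{j_1'}^{\ell} \subset \mathcal{C}_{\mu_2}$, $b' \in \mathcal{C}_{j_2'}^{\ell} \subset \mathcal{C}_{\nu_2}$. We also observe that the above sets
are translates of each other, that is, $\mathcal{C}_{j_1}^{\ell} = \mathcal{C}_{j_1'}^{\ell} + \tau_1$ and
$\mathcal{C}_{j_2}^{\ell} = \mathcal{C}_{j_2'}^{\ell} + \tau_2$, with $|\tau_k| \le 1$. Hence if
$i$ and $i'$ correspond to the pairs $(j_1, j_2)$ and $(j_1', j_2')$, respectively, we have

(11) $\mathcal{K}_i^{\ell}(\lambda) = \mathcal{K}_{i'}^{\ell}(\lambda) + \tau(\lambda)$ with $\tau(\lambda) = \tau_1 + \lambda\tau_2$.

Now let $(A, B)$ be the pair that corresponds to $(a', b')$ under the above
translations, namely

(12) $A = a' + \tau_1, \quad B = b' + \tau_2.$

We claim there is a $\bar\lambda$ such that

(13) $A + \bar\lambda B = a' + \bar\lambda b'.$

In fact, by (12) we have put $B$ in $\mathcal{C}_{j_2}^{\ell} \subset \mathcal{C}_{\nu_1}$, while $b'$ is in $\mathcal{C}_{j_2'}^{\ell} \subset \mathcal{C}_{\nu_2}$. Thus
$|B - b'| \ge 1/2$, since $\nu_1 \ne \nu_2$. We can therefore solve (13) by taking
$\bar\lambda = (A - a')/(b' - B)$. Now we compare this with (10), and get
$\lambda_0 = (a - a')/(b' - b)$. Moreover, $|A - a| \le 4^{-\ell}$ and $|B - b| \le 4^{-\ell}$, since $A$
and $a$ both lie in $\mathcal{C}_{j_1}^{\ell}$, and $B$ and $b$ lie in $\mathcal{C}_{j_2}^{\ell}$. This yields the inequality

(14) $|\bar\lambda - \lambda_0| \le c\,4^{-\ell}.$

Also, (12) and (13) clearly imply $\tau(\bar\lambda) = \tau_1 + \bar\lambda\tau_2 = 0$, and this together
with (11) proves the coincidence.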
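
The estimate (14) follows from the two formulas for $\bar\lambda$ and $\lambda_0$ by a short computation, sketched here. The explicit constant 8 is a by-product of this particular bookkeeping (it uses only that all the points lie in $[0,1]$ and that $|b'-b|, |b'-B| \ge 1/2$); it is not claimed to be the constant intended in (9).

```latex
\begin{aligned}
\bar\lambda - \lambda_0
  &= \frac{A-a'}{b'-B} - \frac{a-a'}{b'-b}
   = \frac{(A-a')(b'-b) - (a-a')(b'-B)}{(b'-B)(b'-b)} \\
  &= \frac{(A-a)(b'-b) + (a-a')(B-b)}{(b'-B)(b'-b)},
\end{aligned}
% using A - a' = (A-a) + (a-a') and b' - B = (b'-b) + (b-B).
% Since |A-a| \le 4^{-\ell}, |B-b| \le 4^{-\ell}, |b'-b| \le 1, |a-a'| \le 1,
% and |b'-B| \ge 1/2, |b'-b| \ge 1/2, we get
\left|\bar\lambda - \lambda_0\right|
  \le \frac{4^{-\ell} + 4^{-\ell}}{1/4} = 8 \cdot 4^{-\ell}.
```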

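Therefore our proposition is proved under the restriction we made
earlier that $\nu_1 \ne \nu_2$. The situation where instead $\mu_1 \ne \mu_2$ is obtained
from the case $\nu_1 \ne \nu_2$ if we replace $\lambda_0$ by $\lambda_0^{-1}$. Note that
$\mathcal{K}_i^{\ell}(\lambda_0) = \mathcal{K}_{i'}^{\ell}(\lambda_0)$ if and only if
$\mathcal{C}_{j_1}^{\ell} + \lambda_0\mathcal{C}_{j_2}^{\ell} = \mathcal{C}_{j_1'}^{\ell} + \lambda_0\mathcal{C}_{j_2'}^{\ell}$, and this is the same as
$\mathcal{C}_{j_2}^{\ell} + \lambda_0^{-1}\mathcal{C}_{j_1}^{\ell} = \mathcal{C}_{j_2'}^{\ell} + \lambda_0^{-1}\mathcal{C}_{j_1'}^{\ell}$. This allows us to reduce to the case
$\mu_1 \ne \mu_2$, since $\mathcal{C}_{j_1}^{\ell} \subset \mathcal{C}_{\mu_1}$ and $\mathcal{C}_{j_1'}^{\ell} \subset \mathcal{C}_{\mu_2}$. Here the fact that $1 \le \lambda_0 \le 4$ gives
$\lambda_0^{-1} \le 1$ and guarantees that the constant $c$ in (9) can be taken to be
independent of $\lambda_0$. The proposition is therefore established.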

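Note that as a consequence, the following holds near the points $\bar\lambda$ where
the coincidence (9) takes place: If $|\lambda - \bar\lambda| \le \epsilon 4^{-\ell}$ then

(15) $\mathcal{K}_i^{\ell}(\lambda) = \mathcal{K}_{i'}^{\ell}(\lambda) + \tau(\lambda)$ with $|\tau(\lambda)| \le \epsilon 4^{-\ell}$.

In fact, this is (11) together with the observation that

$$|\tau(\lambda)| = |\tau(\lambda) - \tau(\bar\lambda)| \le |\lambda - \bar\lambda|,$$

since $\tau(\lambda) = \tau_1 + \lambda\tau_2$ and $|\tau_2| \le 1$.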

The assertion (15) leads to the following more elaborate version of
itself:

There is a set $\Lambda$ of full measure such that whenever $\lambda \in \Lambda$
and $\epsilon > 0$ are given, there are $\ell$ and a pair $i, i'$ so that (15)
holds.$^3$

Indeed, for fixed $\epsilon > 0$, let $\Lambda_\epsilon$ denote the set of $\lambda$ that satisfies (15) for
some $\ell$, $i$ and $i'$. For any interval $I$ of length not exceeding 1, we have

$$m(\Lambda_\epsilon \cap I) \ge \epsilon 4^{-\ell} \ge c^{-1}\epsilon\, m(I),$$

because of (9) and (15). Thus $\Lambda_\epsilon^c$ has no points of Lebesgue density,
hence $\Lambda_\epsilon^c$ has measure zero, and thus $\Lambda_\epsilon$ is a set of full measure. (See
Corollary 1.5 in Chapter 3.) Since $\Lambda = \bigcap_\epsilon \Lambda_\epsilon$, and $\Lambda_\epsilon$ decreases with $\epsilon$,
we see that $\Lambda$ also has full measure and our assertion is proved.

$^3$ The terminology that $\Lambda$ has "full measure" means that its complement has measure
zero.

Finally, our theorem will be established once we show that $m(\mathcal{K}(\lambda)) = 0$
whenever $\lambda \in \Lambda$. To prove this, we assume contrariwise that $m(\mathcal{K}(\lambda)) > 0$.
Using again the point of density argument, there must be for any
$0 < \delta < 1$, a non-empty open interval $I$ with $m(\mathcal{K}(\lambda) \cap I) \ge \delta\, m(I)$. We
then fix $\delta$ with $3/4 < \delta < 1$ (so that $2\delta - 1 > 1/2$, which is what the
final estimate below requires) and proceed. With this fixed $\delta$, we select
$\epsilon$ used below as $\epsilon = m(I)(1 - \delta)$. Next, find $\ell$, $i$, and $i'$ for which (15)
holds. The existence of such indices is guaranteed by the hypothesis that
$\lambda \in \Lambda$.

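We then consider the two similarities (of ratio $4^{-\ell}$) that map $\mathcal{K}(\lambda)$ to
$\mathcal{K}_i^{\ell}(\lambda)$ and $\mathcal{K}_{i'}^{\ell}(\lambda)$, respectively. These take the interval $I$ to corresponding
intervals $I_i$ and $I_{i'}$, respectively, with $m(I_i) = m(I_{i'}) = 4^{-\ell} m(I)$.
Moreover,

$$m(\mathcal{K}_i^{\ell} \cap I_i) \ge \delta\, m(I_i) \quad \text{and} \quad m(\mathcal{K}_{i'}^{\ell} \cap I_{i'}) \ge \delta\, m(I_{i'}).$$

Also, as in (15), $I_i = I_{i'} + \tau(\lambda)$, with $|\tau(\lambda)| \le \epsilon 4^{-\ell}$. This shows that

$$m(I_i \cap I_{i'}) \ge m(I_i) - |\tau(\lambda)| \ge 4^{-\ell} m(I) - \epsilon 4^{-\ell} \ge \delta\, m(I_i),$$

since $\epsilon 4^{-\ell} = (1 - \delta)\, m(I_i)$. Thus $m(I_i - I_i \cap I_{i'}) \le (1 - \delta)\, m(I_i)$, and

$$m(\mathcal{K}_i^{\ell} \cap I_i \cap I_{i'}) \ge m(\mathcal{K}_i^{\ell} \cap I_i) - m(I_i - I_i \cap I_{i'}) \ge (2\delta - 1)\, m(I_i) > \tfrac{1}{2}\, m(I_i \cap I_{i'}).$$

So $m(\mathcal{K}_i^{\ell} \cap I_i \cap I_{i'}) > \frac{1}{2} m(I_i \cap I_{i'})$ and the same holds for $i'$ in place of $i$.
Hence $m(\mathcal{K}_i^{\ell} \cap \mathcal{K}_{i'}^{\ell}) > 0$, and this contradicts the decomposition (8) and
the fact that $m(\mathcal{K}_i^{\ell}) = 4^{-\ell} m(\mathcal{K})$ for every $i$. Therefore we obtain that
$m(\mathcal{K}(\lambda)) = 0$ for every $\lambda \in \Lambda$, and the proof of Theorem 4.12 is now
complete.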
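
The role of the coincidence $\mathcal{K}_i^{\ell}(\lambda) = \mathcal{K}_{i'}^{\ell}(\lambda)$ can be observed on finite generations. The Python sketch below is an illustration under stated assumptions, not part of the proof: it covers each piece $\mathcal{C}_{j_1}^{\ell} + \lambda\mathcal{C}_{j_2}^{\ell}$ by the interval of length $(1+\lambda)4^{-\ell}$ that contains it (for $\lambda > 0$) and computes the exact measure of the union with rational arithmetic. The function names are hypothetical. For $\lambda = 1$ one has $\mathcal{C}_1 + \mathcal{C}_2 = \mathcal{C}_2 + \mathcal{C}_1$, a coincidence at the first generation, and the covers shrink geometrically; for $\lambda = 1/2$ the cover is always the full interval $[0, 3/2]$.

```python
from fractions import Fraction

def cantor_lefts(level):
    """Left endpoints of the 2**level intervals of length 4**-level covering C."""
    lefts = [Fraction(0)]
    for k in range(1, level + 1):
        step = Fraction(3, 4 ** k)
        lefts = [x + d for x in lefts for d in (Fraction(0), step)]
    return lefts

def cover_measure(lam, level):
    """Measure of the union of the intervals [a + lam*b, a + lam*b + (1+lam)*4**-level],
    where a, b run over the left endpoints of generation `level`. Assumes lam > 0.
    This union contains C + lam*C, so its measure bounds m(C + lam*C) from above."""
    lam = Fraction(lam)
    lefts = cantor_lefts(level)
    length = (1 + lam) * Fraction(1, 4 ** level)
    starts = sorted({a + lam * b for a in lefts for b in lefts})
    total = Fraction(0)
    lo, hi = starts[0], starts[0] + length
    for s in starts[1:]:
        if s <= hi:                      # overlapping or touching: merge
            hi = max(hi, s + length)
        else:                            # gap: close the current interval
            total += hi - lo
            lo, hi = s, s + length
    return total + (hi - lo)

if __name__ == "__main__":
    for level in range(1, 5):
        print(level, cover_measure(1, level), cover_measure(Fraction(1, 2), level))
```

For $\lambda = 1$ the number of distinct translates at generation $\ell$ is at most $3^{\ell}$ rather than $4^{\ell}$, so the cover has measure at most $2(3/4)^{\ell}$, in line with the estimate $m(\mathcal{K}) \le (4^{\ell}-1)4^{-\ell}m(\mathcal{K})$ iterated over generations.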

5 Exercises 

1. Show that the measure $m_\alpha$ is not $\sigma$-finite on $\mathbb{R}^d$ if $\alpha < d$.

2. Suppose $E_1$ and $E_2$ are two compact subsets of $\mathbb{R}^d$ such that $E_1 \cap E_2$ contains
at most one point. Show directly from the definition of the exterior measure that
if $0 < \alpha \le d$, and $E = E_1 \cup E_2$, then

$$m_\alpha^*(E) = m_\alpha^*(E_1) + m_\alpha^*(E_2).$$

[Hint: Suppose $E_1 \cap E_2 = \{x\}$, let $B_\epsilon$ denote the open ball centered at $x$ and of
diameter $\epsilon$, and let $E^\epsilon = E \cap B_\epsilon^c$. Show that

$$m_\alpha^*(E^\epsilon) \ge \mathcal{H}_\alpha^\epsilon(E) \ge m_\alpha^*(E) - \mu(\epsilon) - \epsilon^\alpha,$$

where $\mu(\epsilon) \to 0$. Hence $m_\alpha^*(E^\epsilon) \to m_\alpha^*(E)$.]

3. Prove that if $f : [0,1] \to \mathbb{R}$ satisfies a Lipschitz condition of exponent $\gamma > 1$,
then $f$ is a constant.

4. Suppose $f : [0,1] \to [0,1] \times [0,1]$ is surjective and satisfies a Lipschitz condition

$$|f(x) - f(y)| \le C|x - y|^{\gamma}.$$

Prove that $\gamma \le 1/2$ directly, without using Theorem 2.2.

[Hint: Divide $[0,1]$ into $N$ intervals of equal length. The image of each sub-interval
is contained in a ball of volume $O(N^{-2\gamma})$, and the union of all these balls must
cover the square.]

5. Let $f(x) = x^k$ be defined on $\mathbb{R}$, where $k$ is a positive integer and let $E$ be a
Borel subset of $\mathbb{R}$.

(a) Show that if $m_\alpha(E) = 0$ for some $\alpha$, then $m_\alpha(f(E)) = 0$.

(b) Prove that $\dim(E) = \dim f(E)$.

6. Let $\{E_k\}$ be a sequence of Borel sets in $\mathbb{R}^d$. Show that if $\dim E_k \le \alpha$ for some
$\alpha$ and all $k$, then

$$\dim \bigcup_k E_k \le \alpha.$$

7. Prove that the $(\log 2/\log 3)$-Hausdorff measure of the Cantor set is precisely
equal to 1.

[Hint: Suppose we have a covering of $\mathcal{C}$ by finitely many closed intervals $\{I_j\}$.
Then there exists another covering of $\mathcal{C}$ by intervals $\{I_\ell'\}$ each of length $3^{-k}$ for
some $k$, such that $\sum_j |I_j|^{\alpha} \ge \sum_\ell |I_\ell'|^{\alpha} \ge 1$, where $\alpha = \log 2/\log 3$.]

8. Show that the Cantor set of constant dissection, $\mathcal{C}_\xi$, in Exercise 3 of Chapter 1
has strict Hausdorff dimension $\log 2/\log(2/(1 - \xi))$.

9. Consider the set $\mathcal{C}_{\xi_1} \times \mathcal{C}_{\xi_2}$ in $\mathbb{R}^2$, with $\mathcal{C}_\xi$ as in the previous exercise. Show that
$\mathcal{C}_{\xi_1} \times \mathcal{C}_{\xi_2}$ has strict Hausdorff dimension $\dim(\mathcal{C}_{\xi_1}) + \dim(\mathcal{C}_{\xi_2})$.

10. Construct a Cantor-like set (as in Exercise 4, Chapter 1) that has Lebesgue
measure zero, yet Hausdorff dimension 1.

[Hint: Choose $\ell_1, \ell_2, \ldots, \ell_k, \ldots$ so that $1 - \sum_{j=1}^{k} 2^{j-1}\ell_j$ tends to zero sufficiently
slowly as $k \to \infty$.]


11. Let $\mathcal{D} = \mathcal{D}_\mu$ be the Cantor dust in $\mathbb{R}^2$ given as the product $\mathcal{C}_\xi \times \mathcal{C}_\xi$, with
$\mu = (1 - \xi)/2$.

(a) Show that for any real number $\lambda$, the set $\mathcal{C}_\xi + \lambda\mathcal{C}_\xi$ is similar to the projection
of $\mathcal{D}_\mu$ on the line in $\mathbb{R}^2$ with slope $\lambda = \tan\theta$.

(b) Note that among the Cantor sets $\mathcal{C}_\xi$, the value $\xi = 1/2$ is critical in the
construction of the Besicovitch set in Section 4.4. In fact, prove that with
$\xi > 1/2$, then $\mathcal{C}_\xi + \lambda\mathcal{C}_\xi$ has Lebesgue measure zero for every $\lambda$. See also
Problem 10 below.

[Hint: $m_\alpha(\mathcal{C}_\xi + \lambda\mathcal{C}_\xi) < \infty$ for $\alpha = \dim \mathcal{D}_\mu$.]

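12. Define a primitive one-dimensional "measure" $\tilde{m}_1$ as

$$\tilde{m}_1(E) = \inf \sum_{k=1}^{\infty} \operatorname{diam} F_k, \qquad E \subset \bigcup_{k=1}^{\infty} F_k.$$

This is akin to the one-dimensional exterior measure $m_\alpha^*$, $\alpha = 1$, except that no
restriction is placed on the size of the diameters of the $F_k$.

Suppose $I_1$ and $I_2$ are two disjoint unit segments in $\mathbb{R}^d$, $d \ge 2$, with $I_1 = I_2 + h$,
and $|h| < \epsilon$. Then observe that $\tilde{m}_1(I_1) = \tilde{m}_1(I_2) = 1$, while $\tilde{m}_1(I_1 \cup I_2) \le 1 + \epsilon$.
Thus

$$\tilde{m}_1(I_1 \cup I_2) < \tilde{m}_1(I_1) + \tilde{m}_1(I_2) \quad \text{when } \epsilon < 1;$$

hence $\tilde{m}_1$ fails to be additive.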

13. Consider the von Koch curve $\mathcal{K}^{\ell}$, $1/4 < \ell < 1/2$, as defined in Section 2.1.
Prove for it the analogue of Theorem 2.7: the function $t \mapsto \mathcal{K}^{\ell}(t)$ satisfies a
Lipschitz condition of exponent $\gamma = \log(1/\ell)/\log 4$. Moreover, show that the set $\mathcal{K}^{\ell}$
has strict Hausdorff dimension $\alpha = 1/\gamma$.

[Hint: Show that if $\mathcal{O}$ is the shaded open triangle indicated in Figure 14, then
$\mathcal{O} \supset S_0(\mathcal{O}) \cup S_1(\mathcal{O}) \cup S_2(\mathcal{O}) \cup S_3(\mathcal{O})$, where $S_0(x) = \ell x$, $S_1(x) = \rho_\theta(\ell x) + a$,
$S_2(x) = \rho_\theta^{-1}(\ell x) + c$, and $S_3(x) = \ell x + b$, with $\rho_\theta$ the rotation of angle $\theta$. Note that the
sets $S_j(\mathcal{O})$ are disjoint.]

Figure 14. The open set $\mathcal{O}$ in Exercise 13


14. Show that if $\ell < 1/2$, the von Koch curve $t \mapsto \mathcal{K}^{\ell}(t)$ in Exercise 13 is a simple
curve.

[Hint: Observe that if $t = \sum_{j=1}^{\infty} a_j/4^j$, with $a_j = 0, 1, 2$, or 3, then

$$\{\mathcal{K}(t)\} = \bigcap_{j=1}^{\infty} S_{a_1}\bigl(\cdots S_{a_j}(\overline{\mathcal{O}})\bigr).]$$


15. Note that if we take $\ell = 1/2$ in the definition of the von Koch curve in
Exercise 13 we get a "space-filling" curve, one that fills the right triangle whose
vertices are $(0,0)$, $(1,0)$, and $(1/2,1/2)$. The first three steps of the construction
are as in Figure 15, with the intervals traced out in the indicated order.

Figure 15. The first three steps of the von Koch curve when $\ell = 1/2$


16. Prove that the von Koch curve $t \mapsto \mathcal{K}^{\ell}(t)$, $1/4 < \ell < 1/2$, is continuous but
nowhere differentiable.

[Hint: If $\mathcal{K}'(t)$ exists for some $t$, then

$$\lim_{n \to \infty} \frac{\mathcal{K}(u_n) - \mathcal{K}(v_n)}{u_n - v_n}$$

must exist, where $u_n \le t \le v_n$, and $u_n - v_n \to 0$. Choose $u_n = k/4^n$ and
$v_n = (k + 1)/4^n$.]

17. For a compact set $E$ in $\mathbb{R}^d$, define $\#(\epsilon)$ to be the least number of balls of
radius $\epsilon$ that cover $E$. Note that we always have $\#(\epsilon) = O(\epsilon^{-d})$ as $\epsilon \to 0$, and
$\#(\epsilon) = O(1)$ if $E$ is finite.

One defines the covering dimension of $E$, denoted by $\dim_C(E)$, as $\inf \beta$ such
that $\#(\epsilon) = O(\epsilon^{-\beta})$, as $\epsilon \to 0$. Show that $\dim_C(E) = \dim_M(E)$, where $\dim_M$ is the
Minkowski dimension discussed in Section 2.1, by proving the following inequalities
for all $\delta > 0$:

(i) $m(E^{\delta}) \le c\,\#(\delta)\,\delta^{d}$.

(ii) $\#(\delta) \le c'\,m(E^{\delta})\,\delta^{-d}$.

[Hint: To prove (ii), use Lemma 1.2 in Chapter 3 to find a collection of disjoint
balls $B_1, B_2, \ldots, B_N$ of radius $\delta/3$, each centered at $E$, such that their "triples"
$\tilde{B}_1, \tilde{B}_2, \ldots, \tilde{B}_N$ (of radius $\delta$) cover $E$. Then $\#(\delta) \le N$, while $N m(B_j) = cN\delta^{d} \le
m(E^{\delta})$, since the balls $B_j$ are disjoint and are contained in $E^{\delta}$.]

18. Let $E$ be a compact set in $\mathbb{R}^d$.

(a) Prove that $\dim(E) \le \dim_M(E)$, where $\dim$ and $\dim_M$ are the Hausdorff and
Minkowski dimensions, respectively.

(b) However, prove that if $E = \{0, 1/\log 2, 1/\log 3, \ldots, 1/\log n, \ldots\}$, then
$\dim_M E = 1$, yet $\dim E = 0$.

19. Show that there is a constant $c_d$, dependent only on the dimension $d$, such
that whenever $E$ is a compact set,

$$m(E^{2\delta}) \le c_d\, m(E^{\delta}).$$

[Hint: Consider the maximal function $f^*$, with $f = \chi_{E^{\delta}}$, and take $c_d = 6^d$.]


20. Show that if $F$ is the self-similar set considered in Theorem 2.12, then it has
the same Minkowski dimension as Hausdorff dimension.

[Hint: Each $F_k$ is the union of $m^k$ balls of radius $cr^k$. In the converse direction one
sees by Lemma 2.13 that if $\epsilon = r^k$, then each ball of radius $\epsilon$ can contain at most
$c'$ vertices of the $k$th generation. So it takes at least $m^k/c'$ such balls to cover $F$.]

21. From the unit interval, remove the second and fourth quarters (open intervals).
Repeat this process in the remaining two closed intervals, and so on. Let $F$ be the
limiting set, so that

$$F = \Bigl\{x : x = \sum_{k=1}^{\infty} a_k/4^k, \quad a_k = 0 \text{ or } 2\Bigr\}.$$

Prove that $0 < m_{1/2}(F) < \infty$.

22. Suppose $F$ is the self-similar set arising in Theorem 2.9.

(a) Show that if $m \le 1/r^d$, then $m_d(F_i \cap F_j) = 0$ if $i \ne j$.

(b) However, if $m \ge 1/r^d$, prove that $F_i \cap F_j$ is not empty for some $i \ne j$.

(c) Prove that under the hypothesis of Theorem 2.12

$$m_\alpha(F_i \cap F_j) = 0, \quad \text{with } \alpha = \log m/\log(1/r), \text{ whenever } i \ne j.$$
23. Suppose $S_1, \ldots, S_m$ are similarities with ratio $r$, $0 < r < 1$. For each set $E$,
let

$$\tilde{S}(E) = S_1(E) \cup \cdots \cup S_m(E),$$

and suppose $F$ denotes the unique non-empty compact set with $\tilde{S}(F) = F$.

(a) If $\bar{x} \in F$, show that the set of points $\{\tilde{S}^n(\bar{x})\}_{n=1}^{\infty}$ is dense in $F$.

(b) Show that $F$ is homogeneous in the following sense: if $x_0 \in F$ and $B$ is
any open ball centered at $x_0$, then $F \cap B$ contains a set similar to $F$.


24. Suppose $E$ is a Borel subset of $\mathbb{R}^d$ with $\dim E < 1$. Prove that $E$ is totally
disconnected, that is, any two distinct points in $E$ belong to different connected
components.

[Hint: Fix $x, y \in E$, and show that $f(t) = |t - x|$ is Lipschitz of order 1, and hence
$\dim f(E) < 1$. Conclude that $f(E)$ has a dense complement in $\mathbb{R}$. Pick $r$ in the
complement of $f(E)$ so that $0 < r < f(y)$, and use the fact that $E = \{t \in E :
|t - x| < r\} \cup \{t \in E : |t - x| > r\}$.]

25. Let $F(t)$ be an arbitrary non-negative measurable function on $\mathbb{R}$, and $\gamma \in S^{d-1}$.
Then there exists a measurable set $E$ in $\mathbb{R}^d$, such that $F(t) = m_{d-1}(E \cap \mathcal{P}_{t,\gamma})$.


26. Theorem 4.1 can be refined for $d \ge 4$ as follows.

Define $C^{k,\alpha}$ to be the class of functions $F(t)$ on $\mathbb{R}$ that are $C^k$ and for which
$F^{(k)}(t)$ satisfies a Lipschitz condition of exponent $\alpha$.

If $E$ has finite measure, then for a.e. $\gamma \in S^{d-1}$ the function $m(E \cap \mathcal{P}_{t,\gamma})$ is in
$C^{k,\alpha}$ for $k = (d - 3)/2$, $\alpha < 1/2$, if $d$ is odd, $d \ge 3$; and for $k = (d - 4)/2$, $\alpha < 1$,
if $d$ is even, $d \ge 4$.

27. Show that the modification of the inequality (2) of Theorem 4.5 fails if we
drop $\|f\|_{L^2(\mathbb{R}^d)}$ from the right-hand side.

[Hint: Consider $\mathcal{R}^*(f_\epsilon)$, with $f_\epsilon$ defined by $f_\epsilon(x) = (|x| + \epsilon)^{-d+\delta}$, for $|x| \le 1$, with
$\delta$ fixed, $0 < \delta < 1$, and $\epsilon \to 0$.]

28. Construct a compact set $E \subset \mathbb{R}^d$, $d \ge 3$, such that $m_d(E) = 0$, yet $E$ contains
translates of any segment of unit length in $\mathbb{R}^d$. (While particular examples of such
sets can be easily obtained from the case $d = 2$, the determination of the least
Hausdorff dimension among all such sets is an open problem.)


6 Problems 


1. Carry out the construction below of two sets $U$ and $V$ so that

$$\dim U = \dim V = 0 \quad \text{but} \quad \dim(U \times V) \ge 1.$$

Let $I_1, \ldots, I_n, \ldots$ be given as follows:

• Each $I_j$ is a finite sequence of consecutive positive integers; that is, for all $j$

$$I_j = \{n \in \mathbb{N} : A_j \le n \le B_j\} \quad \text{for some given } A_j \text{ and } B_j.$$

• For each $j$, $I_{j+1}$ is to the right of $I_j$; that is, $A_{j+1} > B_j$.

Let $U \subset [0,1]$ consist of all $x$ which when written dyadically $x = .a_1 a_2 \cdots a_n \cdots$
have the property that $a_n = 0$ whenever $n \in \bigcup_j I_j$. Assume also that $A_j$ and $B_j$
tend to infinity (as $j \to \infty$) rapidly enough, say $B_j/A_j \to \infty$ and $A_{j+1}/B_j \to \infty$.
Also, let $J_j$ be the complementary blocks of integers, that is,

$$J_j = \{n \in \mathbb{N} : B_j < n < A_{j+1}\}.$$

Let $V \subset [0,1]$ consist of those $x = .a_1 a_2 \cdots a_n \cdots$ with $a_n = 0$ if $n \in \bigcup_j J_j$.

Prove that $U$ and $V$ have the desired property.

2.* The iso-diametric inequality states the following: If $E$ is a bounded subset of
$\mathbb{R}^d$ and $\operatorname{diam} E = \sup\{|x - y| : x, y \in E\}$, then

$$m(E) \le v_d \left(\frac{\operatorname{diam} E}{2}\right)^{d},$$

where $v_d$ denotes the volume of the unit ball in $\mathbb{R}^d$. In other words, among sets of
a given diameter, the ball has maximum volume. Clearly, it suffices to prove the
inequality for $\overline{E}$ instead of $E$, so we can assume that $E$ is compact.

(a) Prove the inequality in the special case when $E$ is symmetric, that is, $-x \in E$
whenever $x \in E$.

In general, one reduces to the symmetric case by using a technique called Steiner
symmetrization. If $e$ is a unit vector in $\mathbb{R}^d$, and $\mathcal{P}$ is a plane perpendicular to $e$,
the Steiner symmetrization of $E$ with respect to $e$ is defined by

$$S(E, e) = \{x + te : x \in \mathcal{P},\ |t| \le \tfrac{1}{2} L(E; e; x)\},$$

where $L(E; e; x) = m(\{t \in \mathbb{R} : x + te \in E\})$, and $m$ denotes the Lebesgue measure.
Note that $x + te \in S(E, e)$ if and only if $x - te \in S(E, e)$.

(b) Prove that $S(E, e)$ is a bounded measurable subset of $\mathbb{R}^d$ that satisfies
$m(S(E, e)) = m(E)$.

[Hint: Use Fubini's theorem.]

(c) Show that $\operatorname{diam} S(E, e) \le \operatorname{diam} E$.

(d) If $\rho$ is a rotation that leaves $E$ and $\mathcal{P}$ invariant, show that $\rho S(E, e) =
S(E, e)$.

(e) Finally, consider the standard basis $\{e_1, \ldots, e_d\}$ of $\mathbb{R}^d$. Let $E_0 = E$, $E_1 =
S(E_0, e_1)$, $E_2 = S(E_1, e_2)$, and so on. Use the fact that $E_d$ is symmetric to
prove the iso-diametric inequality.





(f) Use the iso-diametric inequality to show that $m(E) = \frac{v_d}{2^d} m_d(E)$ for any
Borel set $E$ in $\mathbb{R}^d$.


3. Suppose $S$ is a similarity.

(a) Show that $S$ maps a line segment to a line segment.

(b) Show that if $L_1$ and $L_2$ are two segments that make an angle $\alpha$, then $S(L_1)$
and $S(L_2)$ make an angle $\alpha$ or $-\alpha$.

(c) Show that every similarity is a composition of a translation, a rotation 
(possibly improper), and a dilation. 

4.* The following gives a generalization of the construction of the Cantor-Lebesgue
function.

Let $F$ be the compact set in Theorem 2.9 defined in terms of $m$ similarities
$S_1, S_2, \ldots, S_m$ with ratio $0 < r < 1$. There exists a unique Borel measure $\mu$
supported on $F$ such that $\mu(F) = 1$ and

$$\mu(E) = \frac{1}{m} \sum_{j=1}^{m} \mu(S_j^{-1}(E)) \quad \text{for any Borel set } E.$$

In the case when $F$ is the Cantor set, the Cantor-Lebesgue function is $\mu([0, x])$.

5. Prove a theorem of Hausdorff: Any compact subset $K$ of $\mathbb{R}^d$ is a continuous
image of the Cantor set $\mathcal{C}$.

[Hint: Cover $K$ by $2^{n_1}$ (some $n_1$) open balls of radius 1, say $B_1, \ldots, B_{2^{n_1}}$ (with
possible repetitions). Let $K_{j_1} = K \cap \overline{B_{j_1}}$ and cover each $K_{j_1}$ with $2^{n_2}$ balls of
radius 1/2 to obtain compact sets $K_{j_1, j_2}$, and so on. Express $t \in \mathcal{C}$ as a ternary
expansion, and assign to $t$ a unique point in $K$ defined by the intersection $K_{j_1} \cap
K_{j_1, j_2} \cap \cdots$ for appropriate $j_1, j_2, \ldots$. To prove continuity, observe that if two
points in the Cantor set are close, then their ternary expansions agree to high
order.]

6. A compact subset $K$ of $\mathbb{R}^d$ is uniformly locally connected if given $\epsilon > 0$
there exists $\delta > 0$ so that whenever $x, y \in K$ and $|x - y| < \delta$, there is a continuous
curve $\gamma$ in $K$ joining $x$ to $y$, such that $\gamma \subset B_\epsilon(x)$ and $\gamma \subset B_\epsilon(y)$.

Using the previous problem, one can show that a compact subset $K$ of $\mathbb{R}^d$ is
the continuous image of the unit interval $[0,1]$ if and only if $K$ is uniformly locally
connected.

7. Formulate and prove a generalization of Theorem 3.5 to the effect that once
appropriate sets of measure zero are removed, there is a measure-preserving
isomorphism of the unit interval in $\mathbb{R}$ and the unit cube in $\mathbb{R}^d$.

8.* There exists a simple continuous curve in the plane of positive two-dimensional
measure.
9. Let $E$ be a compact set in $\mathbb{R}^{d-1}$. Show that $\dim(E \times I) = \dim(E) + 1$, where
$I$ is the unit interval in $\mathbb{R}$.

10.* Let $\mathcal{C}_\xi$ be the Cantor set considered in Exercises 8 and 11. If $\xi < 1/2$, then
$\mathcal{C}_\xi + \lambda\mathcal{C}_\xi$ has positive Lebesgue measure for almost every $\lambda$.


Notes and References 


There are several excellent books that cover many of the subjects treated here. 
Among these texts are Riesz and Nagy [27], Wheeden and Zygmund [33],
Folland [13], and Bruckner et al. [4].

Introduction 

The citation is a translation of a passage in a letter from Hermite to Stieltjes [18]. 

Chapter 1 

The citation is a translation from the French of a passage in [3]. 

We refer to Devlin [7] for more details about the axiom of choice, Hausdorff 
maximal principle, and well-ordering principle. 

See the expository paper of Gardner [14] for a survey of results regarding the 
Brunn-Minkowski inequality. 

Chapter 2 

The citation is a passage from the preface to the first edition of Lebesgue’s book 
on integration [20]. 

Devlin [7] contains a discussion of the continuum hypothesis. 

Chapter 3 

The citation is from Hardy and Littlewood’s paper [15]. 

Hardy and Littlewood proved Theorem 1.1 in the one-dimensional case by 
using the idea of rearrangements. The present form is due to Wiener. 

Our treatment of the isoperimetric inequality is based on Federer [11]. This 
work also contains significant generalizations and much additional material on 
geometric measure theory. 

A proof of the Besicovitch covering lemma in Problem 3* is in
Mattila [22].

For an account of functions of bounded variation in $\mathbb{R}^d$, see Evans and
Gariepy [8].

An outline of the proof of Problem 7 (b)* can be found at the end of Chapter 5 
in Book I. 

The result in part (b) of Problem 8* is a theorem of S. Saks, and its proof as 
a consequence of part (a) can be found in Stein [31]. 

Chapter 4 

The citation is translated from the introduction of Plancherel's article [25].

An account of the theory of almost periodic functions which is touched upon 
in Problem 2* can be found in Bohr [2]. 

The results in Problems 4* and 5* are in Zygmund [35], in Chapters V and VII, 
respectively. 

Consult Birkhoff and Rota [1] for more on Sturm-Liouville systems, Legendre 
polynomials, and Hermite functions. 

Chapter 5 
See Courant [6] for an account of the Dirichlet principle and some of its
applications. The solution of the Dirichlet problem for general domains in $\mathbb{R}^2$ and the
related notion of logarithmic capacity of sets are treated in Ransford [26].
Folland [12] contains another solution to the Dirichlet problem (valid in $\mathbb{R}^d$, $d \ge 2$)
by methods which do not use the Dirichlet principle.

The result regarding the existence of the conformal mapping stated in
Problem 3* is in Chapter VII of Zygmund [35].

Chapter 6 

The citation is a translation from the German of a passage in C. Carathéodory [5].

Petersen [24] gives a systematic presentation of ergodic theory, including a 
proof of the theorem in Problem 7*. 

The facts about spherical harmonics needed in Problem 4* can be found in 
Chapter 4 in Stein and Weiss [32]. 

We refer to Hardy and Wright [16] for an introduction to continued fractions. 
Their connection to ergodic theory is discussed in Ryll-Nardzewski [28]. 

Chapter 7 

The citation is a translation from the German of a passage in Hausdorff's
article [17], while Mandelbrot's citation is from his book [21].

Mandelbrot’s book also contains many interesting examples of fractals arising 
in a variety of different settings, including a discussion of Richardson’s work on 
the length of coastlines. (See in particular Chapter 5.) 

Falconer [10] gives a systematic treatment of fractals and Hausdorff dimension. 
We refer to Sagan [29] for further details on space-filling curves, including the 
construction of a curve arising in Problem 8*. 

The monograph of Falconer [10] also contains an alternate construction of the 
Besicovitch set, as well as the fact that such sets must necessarily have dimension 
two. The particular Besicovitch set described in the text appears in Kahane [19], 
but the fact that it has measure zero required further ideas which are contained, 
for instance, in Peres et al. [30]. 

Regularity of sets in $\mathbb{R}^d$, $d \ge 3$, and the estimates for the maximal function
associated to the Radon transform are in Falconer [9], and Oberlin and Stein [23].

The theory of Besicovitch sets in higher dimensions, as well as a number of 
interesting related topics can be found in the survey of Wolff [34]. 
Symbol Glossary


The page numbers on the right indicate the first time the symbol or 
notation is defined or used. As usual, Z, Q, R, and C denote the integers, 
the rationals, the reals, and the complex numbers respectively. 


$|x|$                                   (Euclidean) norm of $x$                              2
$E^c$, $E - F$                          Complements and relative complements of sets         2
$d(E, F)$                               Distance between two sets                            2
$B_r(x)$, $\overline{B_r}(x)$           Open and closed balls                                2
$\overline{E}$, $\partial E$            Closure and boundary of $E$, respectively            3
$|R|$                                   Volume of the rectangle $R$                          3
$O(\cdots)$                             $O$ notation                                         12
$\mathcal{C}$, $\mathcal{C}_\xi$, $\hat{\mathcal{C}}$   Cantor sets                          9, 38
$m_*(E)$                                Exterior (Lebesgue) measure of the set $E$           10
$E_k \nearrow E$, $E_k \searrow E$      Increasing and decreasing sequences of sets          20
$E \triangle F$                         Symmetric difference of $E$ and $F$                  21
$E_h = E + h$                           Translation by $h$ of the set $E$                    22
$\mathcal{B}_{\mathbb{R}^d}$            Borel $\sigma$-algebra on $\mathbb{R}^d$             23
$G_\delta$, $F_\sigma$                  Sets of type $G_\delta$ or $F_\sigma$                23
$\mathcal{N}$                           Non-measurable set                                   24
a.e.                                    Almost everywhere                                    30
$f^+(x)$, $f^-(x)$                      Positive and negative parts of $f$                   31, 64
$A + B$                                 Sum of two sets                                      35
$v_d$                                   Volume of the unit ball in $\mathbb{R}^d$            39
$\operatorname{supp}(f)$                Support of the function $f$                          53
$f_k \nearrow f$, $f_k \searrow f$      Increasing and decreasing sequences of functions     62
$f_h$                                   Translation by $h$ of the function $f$               73
$L^1(\mathbb{R}^d)$, $L^1_{\mathrm{loc}}(\mathbb{R}^d)$   Integrable and locally integrable functions   69, 105
$f * g$                                 Convolution of $f$ and $g$                           74
$f^y$, $f_x$, $E^y$, $E_x$              Slices of the function $f$ and set $E$               75
$\hat{f}$, $\mathcal{F}(f)$             Fourier transform of $f$                             87, 208
$f^*$                                   Maximal functions of $f$                             100, 296
$L(\gamma)$                             Length of the (rectifiable) curve $\gamma$           115
$T_F$, $P_F$, $N_F$                     Total, positive, and negative variations of $F$      117, 118
$L(A, B)$                               Length of a curve between $t = A$ and $t = B$        120
$D^+(F), \ldots, D_-(F)$                Dini numbers of $F$                                  123
M(K)                         Minkowski content of K                              138
Ω_+(δ), Ω_-(δ)               Outer and inner set of Ω                            143
L^2(R^d)                     Square integrable functions                         156
ℓ^2(Z), ℓ^2(N)               Square summable sequences                           163
H                            Hilbert space                                       161
f ⊥ g                        Orthogonal elements                                 164
D                            Unit disc                                           173
H^2(D), H^2(R^2_+)           Hardy spaces                                        174, 213
S^⊥                          Orthogonal complement of S                          177
A ⊕ B                        Direct sum of A and B                               177
P_S                          Orthogonal projection onto S                        178
T^*, L^*                     Adjoint of operators                                183, 222
S(R^d)                       Schwartz space                                      208
C_0^∞(Ω)                     Smooth functions with compact support in Ω          222
C^n(Ω), C^n(\overline{Ω})    Functions with n continuous derivatives on Ω        223
                             and \overline{Ω}
Δu                           Laplacian of u                                      230
(X, M, μ)                    Measure space                                       263
μ, μ_*, μ_0                  Measure, exterior measure, premeasure               263, 264, 270
μ_1 × μ_2                    Product measure                                     276
S^{d-1}                      Unit sphere in R^d                                  279
σ, dσ(γ)                     Surface measure on the sphere                       280
dF                           Lebesgue-Stieltjes measure                          282
|ν|, ν^+, ν^-                Total, positive, and negative variations of ν       286, 287
ν ⊥ μ                        Mutually singular measures                          288
ν ≪ μ                        Absolutely continuous measures                      289
σ(S)                         Spectrum of S                                       311
m^*_α(E)                     Exterior α-dimensional Hausdorff measure            325
diam S                       Diameter of S                                       325
dim E                        Hausdorff dimension of E                            329
S                            Sierpinski triangle                                 334
A ≈ B                        A comparable to B                                   335
K, K^l                       Von Koch curves                                     338, 340
dist(A, B)                   Hausdorff distance                                  345
P                            Peano mapping                                       349
P_{t,γ}                      Hyperplane                                          360
R(f), R_δ(f)                 Radon transform                                     363, 368
R^*(f), R^*_δ(f)             Maximal Radon transform                             363, 368


Index 


Relevant items that also arose in Book I or Book II are listed in this 
index, preceded by the numerals I or II, respectively. 


F_σ, 23 
G_δ, 23 
σ-algebra 
Borel, 23 
of sets, 23 
Borel, 267 
σ-finite, 263 

σ-finite signed measure, 288 
O notation, 12 

absolute continuity 

of the Lebesgue integral, 66 
absolutely continuous 
functions, 127 
measures, 288 
adjoint, 183, 222 
algebra of sets, 270 
almost disjoint (union), 4 
almost everywhere, a.e., 30 
almost periodic function, 202 
approximation to the identity, 109; 
(I)49 

arc-length parametrization, 136; 
(I)103 

area of unit sphere, 313 
area under graph, 85 
averaging problem, 100 
axiom of choice, 26, 48 

basis 

algebraic, 202 
orthonormal, 164 
Bergman kernel, 254 
Besicovitch 

covering lemma, 153 
set, 360, 362, 374 
Bessel’s inequality, 166; (I)80 
Blaschke factors, 227; (I)26, 153, 
219 


Borel 

σ-algebra, 23, 267 
measure, 269 
on R, 281 
sets, 23, 267 

Borel-Cantelli lemma, 42, 63 
boundary, 3 

boundary-value function, 217 
bounded convergence theorem, 56 
bounded set, 3 
bounded variation, 116 
Brunn-Minkowski inequality, 34, 48 

canonical form, 50 
Cantor dust, 47, 343 
Cantor set, 8, 38, 126, 330, 387 
constant dissection, 38 
Cantor-Lebesgue 

function, 38, 126, 331, 387 
theorem, 95 

Carathéodory measurable, 264 
Cauchy 

in measure, 95 
integral, 179, 220; (II)48 
sequence, 159; (I)24; (II)24 
Cauchy-Schwarz inequality, 157, 
162; (I)72 
chain 

of dyadic squares, 352 
of quartic intervals, 351 
change of variable formula, 149; 
(I)292 

characteristic 
function, 27 
polynomial, 221, 258 
closed set, 2, 267; (II)6 
closure, 3 
coincidence, 377 
compact linear operator, 188 
compact set, 3, 188; (II)6 
comparable, 335 
complement of a set, 2 
complete 
L^2, 159 

measure space, 266 
metric space, 69 
completion 

Borel σ-algebra, 23 
Hilbert space, 170; (I)74 
measure space, 312 
complex-valued function, 67 
conjugate Poisson kernel, 255 
continued fraction, 293, 322 
continuum hypothesis, 96 
contraction, 318 
convergence in measure, 96 
convex 

function, 153 
set, 35 

convolution, 74, 94, 253; (I)44, 139, 
239 

countable unions, 19 
counting measure, 263 
covering dimension, 383 
covering lemma 

Vitali, 102, 128, 152 
cube, 4 
curve 

closed and simple, 137; (I)102; 
(II)20 
length, 115 
quasi-simple, 137, 332 
rectifiable, 115, 134, 332 
simple, 137, 332 
space-filling, 349, 383 
von Koch, 338, 340, 382 
cylinder set, 316 

d’Alembert’s formula, 224 
dense family of functions, 71 
difference set, 44 
differentiation of the integral, 99 
dimension 

Hausdorff, 329 
Minkowski, 333 
Dini numbers, 123 
Dirac delta function, 110, 285 
direct sum, 177 


Dirichlet 
integral, 230 
kernel, 179; (I)37 
principle, 229, 243 
problem, 230; (I)10, 28, 64, 170; 
(II)212, 216 
distance 

between two points, 2 
between two sets, 2, 267 
Hausdorff, 345 

dominated convergence theorem, 67 

doubling mapping, 304 

dyadic 

correspondence, 353 
induced mapping, 353 
rationals, 351 
square, 352 

Egorov’s theorem, 33 
eigenvalue, 186; (I)233 
eigenvector, 186 
equivalent functions, 69 
ergodic, (I)111 

maximal theorem, 297 
mean theorem, 295 
measure-preserving 
transformation, 302 
pointwise theorem, 300 
extension principle, 183, 210 
exterior measure, 264 
Hausdorff, 325 
Lebesgue, 10 
metric, 267 

Fatou’s lemma, 61 
Fatou’s theorem, 173 
Fejér kernel, 112; (I)53, 163 
finite rank operator, 188 
finite-valued function, 27 
Fourier 

coefficient, 170; (I)16, 34 
inversion formula, 86; (I)141, 182; 
(II)115 

multiplier operator, 200, 220 
series, 171, 316; (I)34; (II)101 
transform in L^1, 87 
transform in L^2, 207, 211 
fractal, 329 

Fredholm alternative, 204 
Fubini’s theorem, 75, 276 
function 

absolutely continuous, 127, 285 
almost periodic, 202 
boundary-value, 217 
bounded variation, 116, 154 
Cantor-Lebesgue, 126, 331 
characteristic, 27 
complex-valued, 67 
convex, 153 
Dirac delta, 110 
finite-valued, 27 
increasing, 117 
integrable, 59, 275 
jump, 132 

Lebesgue integrable, 59, 64, 68 
Lipschitz (Hölder), 330; (I)43 
measurable, 28 
negative variation, 118 
normalized, 282 

nowhere differentiable, 154, 383 

positive variation, 118 

sawtooth, 200; (I)60, 83 

simple, 27, 50, 274 

slice, 75 

smooth, 222 

square integrable, 156 

step, 27 

strictly increasing, 117 
support, 53 
total variation, 117 
fundamental theorem of the 
calculus, 98 

Gaussian, 88; (I)135, 181 
good kernel, 88, 108; (I)48 
gradient, 236 

Gram-Schmidt process, 167 
Green’s 

formula, 313 
kernel, 204; (II)217 

Hardy space, 174, 203, 213 
harmonic function, 234; (I)20; (II)27 
Hausdorff 

dimension, 329 
distance, 345 
exterior measure, 325 
maximal principle, 48 


measure, 327 
strict dimension, 329 
heat kernel, 111; (I)120, 146, 209 
Heaviside function, 285 
Heine-Borel covering property, 3 
Hermite functions, 205; (I)168, 173 
Hermitian operator, 190 
Hilbert space, 161; (I)75 
L^2, 156 

finite dimensional, 168 
infinite dimensional, 168 
orthonormal basis, 164 
Hilbert transform, 220, 255 
Hilbert-Schmidt operator, 187 
homogeneous set, 385 

identity operator, 180 
inequality 

Bessel, 166; (I)80 
Brunn-Minkowski, 34, 48 
Cauchy-Schwarz, 157, 162; (I)72 
iso-diametric, 328, 386 
isoperimetric, 143; (I)103 
triangle, 157, 162 
inner product, 157; (I)71 
integrable function, 59, 275 
integral operator, 187 
kernel, 187 
interior 
of a set, 3 
point, 3 

invariance of Lebesgue measure 
dilation, 22, 73 
linear transformation, 96 
rotation, 96, 151 
translation, 22, 73, 313 
invariant 

function, 302 
set, 302 
vectors, 295 

iso-diametric inequality, 328, 386 
isolated point, 3 
isometry, 198 

isoperimetric inequality, 143; (I)103, 
122 

jump 

discontinuity, 131; (I)63 
function, 132 
Kakeya set, 362 
kernel 

Dirichlet, 179; (I)37 
Fejér, 112; (I)53 
heat, 111; (I)209 
Poisson, 111, 171, 217; (I)37, 55, 
149, 210; (II)67, 78, 109, 113, 
216 

Laplacian, 230 
Lebesgue 

decomposition, 150 
density, 106 
exterior measure, 10 
integrable function, 59, 64, 68 
integral, 50, 54, 58, 64 
measurable set, 16 
set, 106 

Lebesgue differentiation theorem, 
104, 121 

Lebesgue measure, 16 

dilation-invariance, 22, 73 
rotation-invariance, 96, 151 
translation-invariance, 22, 73, 313 
Lebesgue-Radon-Nikodym theorem, 
290 

Lebesgue-Stieltjes integral, 281 
Legendre polynomials, 205; (I)95 
limit 

non-tangential, 196 
point, 3 
radial, 173 
linear functional, 181 
null-space, 182 

linear operator (transformation), 

180 

adjoint, 183 
bounded, 180 
compact, 188 
continuous, 181 
diagonalized, 185 
finite rank, 188 
Hilbert-Schmidt, 187 
identity, 180 
invertible, 311 
norm, 180 
positive, 307 
spectrum, 311 
symmetric, 190 


linear ordering, 26, 48 
linearly independent 
elements, 167 
family, 167 

Lipschitz condition, 90, 147, 151, 
330, 362 

Littlewood’s principles, 33 
locally integrable function, 105 
Lusin’s theorem, 34 

maximal 

function, 100, 261 
theorem, 101, 297 
maximum principle, 235; (II)92 
mean-value property, 214, 234, 313; 
(I)152; (II)102 
measurable 

Carathéodory, 264 
function, 28, 273 
rectangle, 276 
set, 16, 264 
measure, 263 

absolutely continuous, 288 
counting, 263 
exterior, 264 
Hausdorff, 327 
Lebesgue, 16 
mutually singular, 288 
outer, 264 
signed, 285 
support, 288 
measure space, 263 
complete, 266 
measure-preserving 
isomorphism, 292 
transformation, 292 
Mellin transform, 253; (II)177 
metric, 267 

exterior measure, 267 
space, 266 
Minkowski 

content, 138, 151 
dimension, 333 
mixing, 305 

monotone convergence theorem, 62 
multiplication formula, 88 
multiplier, 220 
multiplier sequence, 186, 200 
mutually singular measures, 288 
negative variation 
function, 118 
measure, 287 

non-measurable set, 24, 44, 82 
non-tangential limit, 196 
norm 

L^1(R^d), 69 
L^2(R^d), 157 
Euclidean, 2 
Hardy space, 174, 213 
linear operator, 180 
normal 

number, 318 
operator, 202 
normalized 

increasing function, 282 
nowhere differentiable function, 154, 
383; (I)113, 126 

open 

ball, 2, 267 
set, 2, 267 
ordered set 
linear, 26, 48 
partial, 48 
orthogonal 

complement, 177 
elements, 164 
projection, 178 
orthonormal 
basis, 164 
set, 164 
outer 

Jordan content, 41 
measure, 10, 264 
outside-triangle condition, 248 

Paley-Wiener theorem, 214, 259; 

(II)122 

parallelogram law, 176 
Parseval’s identity, 167, 172; (I)79 
partial differential operator 
constant coefficient, 221 
elliptic, 258 
partitions of a set, 286 
Peano 

curve, 350 
mapping, 350 
perfect set, 3 


perpendicular elements, 164 

Plancherel’s theorem, 208; (I)182 

plane, 360 

point in R^d, 2 

point of density, 106 

Poisson 

integral representation, 217; 

(I)57; (II)45, 67, 109 
kernel, 111, 171, 217; (I)37, 55, 
149, 210; (II)67, 78, 109, 113, 
216 

polar coordinates, 279; (I)179 
polarization, 168, 184 
positive variation 
function, 118 
measure, 287 

pre-Hilbert space, 169, 225; (I)75 

premeasure, 270 

product 

measure, 276 
sets, 83 

Pythagorean theorem, 164; (I)72 

quartic intervals, 351 
chain, 351 

quasi-simple curve, 332 
radial limit, 173 

Radon transform, 363; (I)200, 203 
maximal, 363 
rectangle, 3 

measurable, 276 
volume, 3 

rectifiable curve, 115, 134, 332 
refinement (of a partition), 116; 
(I)281, 290 

regularity of sets, 360 
regularization, 209 
Riemann integrable, 40, 47, 57; 
(I)31, 281, 290 

Riemann-Lebesgue lemma, 94 
Riesz representation theorem, 182, 
290 

Riesz-Fischer theorem, 70 
rising sun lemma, 121 
rotations of the circle, 303 

sawtooth function, 200; (I)60, 83 
self-adjoint operator, 190 
self-similar, 342 

separable Hilbert space, 160, 162 
set 

bounded eccentricity, 108 
cylinder, 316 
difference, 44 
self-similar, 342 
shrink regularly, 108 
slice, 75 

uniformly locally connected, 387 
shift, 317 

Sierpinski triangle, 334 
signed measure, 285 
similarities 
separated, 346 
similarity, 342 
ratio, 342 
simple 

curve, 332 

function, 27, 50, 274 
slice, 361 
function, 75 
set, 75 

smooth function, 222 

Sobolev embedding, 257 

space L^1 of integrable functions, 68 

space-filling curve, 349, 383 

span, 167 

special triangle, 248 
spectral 
family, 306 
resolution, 306 
theorem, 190, 307; (I)233 
spectrum, 191, 311 
square integrable functions, 156 
Steiner symmetrization, 386 
step function, 27 
strong convergence, 198 
Sturm-Liouville, 185, 204 
subspace 


closed, 175 
linear, 174 
support 

function, 53 
measure, 288 
symmetric 
difference, 21 
linear operator, 184, 190 

Tchebychev inequality, 91 
Tietze extension principle, 246 
Tonelli’s theorem, 80 
total variation 
function, 117 
measure, 286 
translation, 73; (I)177 
continuity under, 74; (I)133 
triangle inequality, 157, 162, 267 

uniquely ergodic, 304 
unit disc, 173; (II)6 
unitary 

equivalence, 168 
isomorphism, 168 
mapping, 168; (I)143, 233 

Vitali covering, 102, 128, 152 
volume of unit ball, 92, 313; (I)208 
von Koch curve, 338, 340, 382 

weak 

convergence, 197, 198 
solution, 223 

weak-type inequality, 101, 146, 161 
weakly harmonic function, 234 
well ordering 
principle, 26, 48 
well-ordered set, 26 
Wronskian, 204 


